Preface


This book evolved from the notes of a course that I teach at the Uni- 
versity of Geneva, for undergraduate physics students. For many gen- 
erations of physicists, including mine, the classic references for classical 
electrodynamics have been the textbook by Jackson and that by Landau
and Lifschits.¹ The former is still much used, although more modern
and excellent textbooks with a somewhat similar structure, such as Garg 
(2012) or Zangwill (2013), now exist, while the latter is by now rarely 
used, even as an auxiliary reference text for a course. Because of my 
field-theoretical background, as my notes were growing I realized that 
they were naturally drifting toward what looked to me like a modern version
of Landau and Lifschits, and this stimulated me to expand them
further into a book. 

While this book is meant as a modern introduction to classical elec- 
trodynamics, it is by no means intended as a first introduction to the 
subject. The reader is assumed to have already had a first course on 
electrodynamics, at a level covered for instance by Griffiths (2017). This 
also implies a different structure of the presentation. In a first course 
of electrodynamics, it is natural to take a ‘bottom-up’ approach, where 
one starts from experimental observations in the simple settings of elec- 
trostatics and magnetostatics, and then moves toward time-dependent 
phenomena and electromagnetic induction, which eventually leads to 
generalizing the equations governing electrostatics and magnetostatics 
into the synthesis provided by the full Maxwell’s equations. This approach
is the natural one for a first introduction because, first of all, it
gives the correct historical perspective and shows how Maxwell’s equations
emerged from the unification of a large body of observations; furthermore,
it also allows one to start with more elementary mathematical
tools, for the benefit of the student who meets some of them for the first
time, while at the same time discovering all these new and fundamental 
physics concepts. The price that is paid is that the approach, following 
the historical developments, is sometimes heuristic, and the logic of the 
arguments and derivations is not always tight. 

For this more advanced text, I have chosen instead a ‘top-down’ ap- 
proach. Maxwell’s equations are introduced immediately (after an intro- 
ductory chapter on mathematical tools) as the ‘definition’ of the theory, 
and their consequences are then systematically developed. This has the 
advantage of a better logical clarity. It will also allow us to always go 
into the ‘real story’, rather than presenting at first a simpler version, to 
be later improved. 


1In the text we will refer to the latest
editions of these books, Jackson (1998) 
and Landau and Lifschits (1975). How- 
ever, these books went through many 
editions: the first edition of Jackson ap- 
peared in 1962, while that of Landau 
and Lifschits even dates back to 1939. 




2This approach is different from, e.g.,
that of Jackson, or Zangwill. It is
instead the same followed by Garg,
and especially by Landau and Lifschits,
who even separated the subjects
into two different books: “The Clas- 
sical Theory of Fields”, Landau and 
Lifschits (1975), for vacuum electrody- 
namics, and “Electrodynamics of Con- 
tinuous Media”, Landau and Lifschits 
(1984) for electrodynamics in materials. 


3By comparison, Jackson introduces 
the gauge potentials in full general- 
ity for the first time only after about 
220 pages and Zangwill after about 500 
pages, and their introduction is in gen- 
eral presented simply as a trick for sim- 
plifying the equations. However, their 
role is much more fundamental, since 
they are the basic dynamical variables 
in a field-theoretical treatment (which 
also implies that they will become the 
basic variables also when one moves to 
a quantum treatment). As for Special 
Relativity, Jackson introduces it only 
after more than 500 pages, while Zang- 
will relegates it to Chapter 22, after 820 
pages, and Garg to Chapter 24. 


An important aspect of our presentation is that we keep distinct the 
discussion of electrodynamics ‘in vacuum’ (i.e., the computation of the 
electromagnetic fields generated by localized sources, in the region out- 
side the sources) from the study of Maxwell’s equations inside mate- 
rials. The study of the equations ‘in vacuum’ reveals the underlying 
fundamental structure of the theory, while classical electrodynamics in 
material media is basically a phenomenological theory. Mixing the two 
treatments, because of a formal similarity among the equations, can be 
conceptually confusing. Until Chapter 12 we will focus exclusively on vacuum
electrodynamics, while from Chapter 13 we study electrodynamics
in materials.²

Focusing first on vacuum electrodynamics allows us to bring out the 
two most important structural aspects of the theory at its fundamental 
level, namely gauge invariance and the fact that Special Relativity is 
hidden in Maxwell’s equations. We will introduce immediately and
in full generality the gauge potentials, and work out most of the equa- 
tions and derivations of vacuum electrodynamics in terms of them. From 
a modern field-theoretical perspective, we know that classical electrody- 
namics is the prototype of a gauge theory, and the notion of gauge fields 
and gauge invariance is central to all modern particle physics, as well as 
to condensed matter theory. Similarly, after having duly derived from 
Maxwell’s equations the most elementary results of electrostatics and 
magnetostatics, as well as the notions of work and electromagnetic en- 
ergy and the expansion in static multipoles, we move as fast as possible 
to Special Relativity, introducing the covariant formalism and showing 
how Maxwell’s equations can be reformulated in a covariant form.³ Having
in our hands the gauge potentials and the covariant formalism, most
of the subsequent derivations in Chapters 8-12 are performed in terms 
of them, with a clear advantage in technical and conceptual clarity. 

Even if this book was born from my notes for an undergraduate course, 
and is meant to be used for such a course, it has obviously grown well 
beyond the original scope, and some parts of it are quite advanced. 
More technical sections, or whole chapters that are more specialized, 
are clearly marked, so that the book can be used at different levels, 
from the undergraduate student, to the researcher that needs to check a 
textbook as a reference. Classical electrodynamics, for its richness and 
importance, is a subject to which one returns over and over during a 
scientist’s career. 

Finally, an important point, when writing a textbook of electrodynam- 
ics, is the choice of the system of units. In mechanics, the transforma- 
tion between systems such as c.g.s. (centimeter-gram-second) and m.k.s. 
(meter-kilogram-second) is trivial, and just amounts to multiplicative 
factors. However, in electromagnetism there are further complications. 
This has led to two main systems of units for classical electrodynamics: 
the SI system, and the Gaussian system. As we will discuss in Chapter 2, 
the essential difference is that, for electromagnetism, the SI system, besides
the units of length, mass and time, introduces a fourth independent
base unit of current, the ampere, while in the Gaussian system the unit 


of charge, and therefore of current, is derived from the three basic units 
of length, mass and time. 

The SI system is the natural one for applications to the macroscopic 
world: currents are measured in amperes, voltages in volts, and so on. 
This makes the SI system the obvious choice for laboratory applications 
and in electrical engineering, and SI units are by now the almost uni- 
versal standard for electrodynamics courses at the undergraduate level. 
The Gaussian system, on the other hand, has advantages in other con- 
texts, and in particular leads to neater formulas when relativistic effects 
are important.⁴

This state of affairs has led to a rather peculiar situation. In general, 
undergraduate textbooks of classical electromagnetism always use SI 
units; in contrast, more advanced textbooks of classical electrodynamics 
are often split between SI and Gaussian units, and all textbooks on 
quantum electrodynamics and quantum field theory use Gaussian units. 
The difficulty of the choice is exemplified by Jackson’s textbook,
which has been the ‘bible’ of classical electrodynamics for generations of
physicists. The second edition (1975), as the first, used Gaussian units 
throughout. However, the third edition (1998) switched to SI units for 
the first 10 chapters, in recognition of the fact that almost all other 
undergraduate level textbooks used SI units; then, from Chapter 11 
(Special Theory of Relativity) on, it goes back to Gaussian units, in 
recognition of the fact that they are more appropriate than SI units 
for relativistic phenomena.⁵ Gaussian units are also the most common
choice in quantum mechanics textbooks: when computing the energy 
level of the hydrogen atom, almost all textbooks use a Coulomb potential 
in Gaussian units, $-e^2/r$, rather than the SI expression $-e^2/(4\pi\epsilon_0 r)$.⁶

In this book we will use SI units, since this is nowadays the almost 
universal standard for an undergraduate textbook on classical electrody- 
namics. However, it is important to be familiar also with the Gaussian 
system, as a bridge toward graduate and more specialized courses. This 
is particularly important for the student that wishes to go into theo- 
retical high-energy physics where, eventually, only the Gaussian system 
will be used. We will then discuss in Section 2.2 how to quickly trans- 
late from SI to Gaussian units, and, in Appendix A, we will provide an 
explicit translation in Gaussian units of the most important results and 
formulas of the main text. 


Finally, I wish to thank Enis Belgacem, Francesco Iacovelli and Michele 
Mancarella, who gave the exercise sessions of the course for various years, 
and Stefano Foffa for useful discussions. I thank again Francesco Iacovelli 
for producing a very large number of figures of the book. I am grateful 
to Stephen Blundell for extremely useful comments on the manuscript. 
Last but not least, as with my previous books with OUP, I wish to thank 
Sonke Adlung, for his friendly and always very useful advice, as well as 
all the staff at OUP. 


Geneva, January 2023 




4Actually, its real virtues appear
when combining electromagnetism with
quantum mechanics. In this case, the
reduction from four to three base units
obtained with the Gaussian system can
be pushed further, using a system of
units where one also sets $\hbar = c = 1$,
with the result that one remains with a
single base unit, typically taken to be
the mass unit. In quantum field theory
this system is so convenient that
units $\hbar = c = 1$ are called natural units
(we will briefly mention them in Section 2.2).
As a consequence, all generalizations
from classical electrodynamics
to quantum electrodynamics (and its
extensions such as the Standard Model
of electroweak and strong interactions)
are nowadays uniquely discussed using
the Gaussian system (or, rather, a variant
of it, Heaviside-Lorentz or rationalized
Gaussian units, differing just by
the placing of some $4\pi$ factors, that we
will also introduce in Chapter 2), supplemented
by units $\hbar = c = 1$.


5 Among the other ‘old-time’ classics, 
Landau and Lifschits (1975) used Gaus- 
sian units, while the Feynman Lectures 
on Physics, Feynman et al. (1964), used 
SI. The first two editions of the clas- 
sic textbook by Purcell used Gaussian 
units, but switched to SI for the 3rd edi- 
tion, Purcell and Morin (2013). Among 
more recent books, SI is used in Grif- 
fiths (2017), Zangwill (2013) and Tong 
(2015), while Gaussian units are used in 
Garg (2012) (with frequent translations 
to SI units). 


6The most notable exception is the
quantum mechanics textbook Griffiths 
(2004), that uses SI units, consistently 
with the classical electrodynamics book 
by the same author. 
Mathematical tools


Classical electrodynamics requires a good familiarity with a set of math- 
ematical tools, which will then find applications basically everywhere in 
physics. We find it convenient to begin by recalling some of these con- 
cepts so that, later, the understanding of the physics will not be obscured 
by the mathematical manipulations. We will focus here on tools that will 
be of more immediate use. Further mathematical tools will be discussed 
along the way, in the rest of the book, as they will be needed. 


1.1 Vector algebra 


A vector $\mathbf{a}$ has Cartesian components $a_i$. We use the convention that
repeated indices are summed over, so,

$$ \mathbf{a}\cdot\mathbf{b} = \sum_i a_i b_i = a_i b_i , \tag{1.1} $$

where the sum runs over $i = 1, 2, 3$ in three spatial dimensions, as we will
assume next, or, more generally, over $i = 1, \ldots, d$ in $d$ spatial dimensions.
We can introduce the “Kronecker delta” $\delta_{ij}$, which is equal to 1 if $i = j$
and zero otherwise. Note that, with the convention of the sum over
repeated indices, we have the identity

$$ a_i = \delta_{ij} a_j . \tag{1.2} $$

Then, we can also rewrite

$$ \mathbf{a}\cdot\mathbf{b} = \delta_{ij} a_i b_j . \tag{1.3} $$

From the definition it follows that, in three dimensions, $\delta_{ii} = 3$. Note
that $\delta_{ij}$ are just the components of the $3\times 3$ identity matrix $I$, $\delta_{ij} = (I)_{ij}$,
and $\delta_{ii}$ is the trace of the $3\times 3$ identity matrix (or, in $d$ dimensions, $\delta_{ij}$
are the components of the $d\times d$ identity matrix, and $\delta_{ii} = d$).

When using the convention of summing over repeated indices, one
must be careful not to use the same dummy index for different summations.
For example, writing the sums explicitly,

$$ (\mathbf{a}\cdot\mathbf{b})(\mathbf{c}\cdot\mathbf{d}) = \Big(\sum_{i=1}^{3} a_i b_i\Big)\Big(\sum_{j=1}^{3} c_j d_j\Big) . \tag{1.4} $$

With the convention of the sum over repeated indices, the right-hand
side becomes $a_i b_i c_j d_j$. Notice that it was important here to use two
different letters, $i$, $j$, for the dummy indices involved.
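As a concrete illustration of this index bookkeeping, the following minimal Python sketch writes the sums over repeated indices explicitly. The helper names (`delta`, `dot`) and the sample vectors are assumptions made for this example only; they are not part of the text.

```python
# Index-notation bookkeeping with plain Python lists (illustrative sketch).
# Indices run over 0, 1, 2 here instead of 1, 2, 3.

def delta(i, j):
    """Kronecker delta: 1 if i == j, 0 otherwise."""
    return 1 if i == j else 0

def dot(a, b):
    """a.b = a_i b_i, with the sum over the repeated index written out."""
    return sum(a[i] * b[i] for i in range(len(a)))

# Hypothetical sample vectors
a = [1.0, 2.0, 3.0]
b = [4.0, -1.0, 0.5]
c = [2.0, 0.0, 1.0]
d = [-3.0, 5.0, 2.0]

# a_i = delta_ij a_j
a_from_delta = [sum(delta(i, j) * a[j] for j in range(3)) for i in range(3)]

# a.b = delta_ij a_i b_j
dot_from_delta = sum(delta(i, j) * a[i] * b[j]
                     for i in range(3) for j in range(3))

# delta_ii = 3 in three dimensions (trace of the identity matrix)
trace_delta = sum(delta(i, i) for i in range(3))

# (a.b)(c.d) = a_i b_i c_j d_j: two distinct dummy indices
product_two_indices = sum(a[i] * b[i] * c[j] * d[j]
                          for i in range(3) for j in range(3))

# Reusing a single index gives a_i b_i c_i d_i, which is a different quantity
product_one_index = sum(a[i] * b[i] * c[i] * d[i] for i in range(3))
```

With these sample vectors the two-index expression reproduces `dot(a, b) * dot(c, d)`, while the single-index expression does not, which is the pitfall the text warns about.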


The vector product is given by

$$ (\mathbf{a}\times\mathbf{b})_i = \epsilon_{ijk} a_j b_k , \tag{1.5} $$

where $\epsilon_{ijk}$ is the totally antisymmetric tensor (or the Levi-Civita tensor),
defined by $\epsilon_{123} = +1$, together with the condition that it is antisymmetric
under any exchange of indices. Therefore $\epsilon_{ijk} = 0$ if two
indices have the same value and, e.g., $\epsilon_{213} = -\epsilon_{123} = -1$. This also
implies that the tensor is cyclic, $\epsilon_{ijk} = -\epsilon_{jik} = \epsilon_{jki}$, i.e., it is unchanged
under a cyclic permutation of the indices. Note that $\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}$.

We will see in Section 1.6 that the tensors $\delta_{ij}$ and $\epsilon_{ijk}$ play a special role
in the theory of representations of the rotation group. Unit vectors are
denoted by a hat; for instance, $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are the unit vectors along
the $x$, $y$ and $z$ axes, respectively. Note that, e.g.,

$$ \hat{\mathbf{x}}\times\hat{\mathbf{y}} = \hat{\mathbf{z}} . \tag{1.6} $$

A very useful identity is

$$ \epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl} . \tag{1.7} $$

(Prove it!) Note the structure of the indices: on the left, the index $i$ is
summed over, so it does not appear on the right-hand side. It is a dummy
index, and we could have used a different name for it. For instance, the
left-hand side of eq. (1.7) could be written as $\epsilon_{pjk}\epsilon_{plm}$, with a different
letter $p$. In contrast, the indices $j, k, l, m$ are free indices so, if they
appear on the left-hand side, they must also appear on the right-hand
side. Note also that, because of the cyclic property of the epsilon tensor,
the left-hand side of eq. (1.7) can also be written as $\epsilon_{jki}\epsilon_{ilm}$.

Exercise 1.1 Show that

$$ \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \epsilon_{ijk} a_i b_j c_k . \tag{1.8} $$

Exercise 1.2 Using eq. (1.7), show that

$$ \mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c} , \tag{1.9} $$
$$ (\mathbf{a}\times\mathbf{b})\times\mathbf{c} = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{b}\cdot\mathbf{c})\,\mathbf{a} . \tag{1.10} $$

Observe that the vector product is not associative: in general, $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$
is different from $(\mathbf{a}\times\mathbf{b})\times\mathbf{c}$.

Exercise 1.3 Using eq. (1.7), show that

$$ \epsilon_{ijk}\epsilon_{ijm} = 2\delta_{km} . \tag{1.11} $$

Double-check the result by directly identifying the combinations of indices
that give non-zero contributions to the left-hand side of eq. (1.11).
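The identities of this section can be checked by brute force over all index values. The sketch below is only an illustration, not a proof: the closed formula used for the Levi-Civita symbol, the 0-based indices, and the integer sample vectors are choices made for this example.

```python
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}; eps(0, 1, 2) = +1
    is the 0-based counterpart of epsilon_123 = +1."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    """(a x b)_i = eps_ijk a_j b_k."""
    return [sum(eps(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3)) for i in range(3)]

# eps_ijk eps_ilm = delta_jl delta_km - delta_jm delta_kl, for all free indices
contraction_identity_holds = all(
    sum(eps(i, j, k) * eps(i, l, m) for i in range(3))
    == delta(j, l) * delta(k, m) - delta(j, m) * delta(k, l)
    for j, k, l, m in product(range(3), repeat=4)
)

# eps_ijk eps_ijm = 2 delta_km
double_contraction_holds = all(
    sum(eps(i, j, k) * eps(i, j, m) for i in range(3) for j in range(3))
    == 2 * delta(k, m)
    for k, m in product(range(3), repeat=2)
)

# Hypothetical integer vectors, so that all comparisons are exact
a, b, c = [1, 2, 3], [4, -1, 2], [0, 5, -2]

triple_left = cross(a, cross(b, c))                                   # a x (b x c)
triple_right = [dot(a, c) * b[i] - dot(a, b) * c[i] for i in range(3)]

other_left = cross(cross(a, b), c)                                    # (a x b) x c
other_right = [dot(a, c) * b[i] - dot(b, c) * a[i] for i in range(3)]

x_hat_cross_y_hat = cross([1, 0, 0], [0, 1, 0])
```

Comparing `triple_left` with `other_left` for these vectors also shows explicitly that the vector product is not associative.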
1.2 Differential operators on scalar and vector fields


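We will use the notation $\partial_i = \partial/\partial x^i$ for the partial derivative with
respect to the Cartesian coordinates $x^i$. Then, if $f(\mathbf{x})$ is a function of the
spatial coordinates, its gradient $\nabla f$ is a vector field (i.e., a vector defined
at each point of space) whose components, in Cartesian coordinates, are
given by

$$ (\nabla f)_i = \partial_i f \qquad \text{(gradient of a scalar function)} , \tag{1.12} $$

or, in vector form,

$$ \nabla f = (\partial_x f)\,\hat{\mathbf{x}} + (\partial_y f)\,\hat{\mathbf{y}} + (\partial_z f)\,\hat{\mathbf{z}} . \tag{1.13} $$

The expression in polar coordinates $(r, \theta, \phi)$ can be obtained by performing
explicitly the transformation between the derivatives $\partial_x, \partial_y, \partial_z$ and
the derivatives $\partial_r = \partial/\partial r$, $\partial_\theta = \partial/\partial\theta$, and $\partial_\phi = \partial/\partial\phi$, and expressing
$\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}$ in terms of the unit vectors $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\phi}}$ (and similarly in any other
coordinate system, such as cylindrical coordinates); we will give the results
in polar and cylindrical coordinates for different operators at the end of this
section. Given a vector field $\mathbf{v}(\mathbf{x})$, we can form two notable quantities
with the action of $\nabla$: the divergence

$$ \nabla\cdot\mathbf{v} = \partial_i v_i \qquad \text{(divergence of a vector field)} , \tag{1.14} $$

which is a scalar field (i.e., a quantity invariant under rotations, defined
at each point of space), and the curl, $\nabla\times\mathbf{v}$, which is again a vector
field, with Cartesian components

$$ (\nabla\times\mathbf{v})_i = \epsilon_{ijk}\partial_j v_k \qquad \text{(curl of a vector field)} . \tag{1.15} $$

Given a function $f$, after forming the vector field $\nabla f$, we can obtain
again a scalar by taking the divergence of $\nabla f$. This defines the Laplacian
$\nabla^2$, $\nabla^2 f = \nabla\cdot(\nabla f)$, or

$$ \nabla^2 f = \partial_i\partial_i f = (\partial_x^2 + \partial_y^2 + \partial_z^2) f , \tag{1.16} $$

where $\partial_x = \partial/\partial x$, etc. Similarly, we can differentiate further the divergence
or the curl of a vector field. For instance, $\nabla(\nabla\cdot\mathbf{v})$ is a vector field
with components

$$ [\nabla(\nabla\cdot\mathbf{v})]_i = \partial_i\partial_j v_j , \tag{1.17} $$

while $\nabla\cdot(\nabla\times\mathbf{v})$ is a scalar field, since it is the divergence of a vector
field. However, from the explicit computation in components,

$$ \nabla\cdot(\nabla\times\mathbf{v}) = \partial_i(\epsilon_{ijk}\partial_j v_k) = \epsilon_{ijk}\partial_i\partial_j v_k = 0 . \tag{1.18} $$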
1Our convention on the polar angles is
such that spherical coordinates are related
to Cartesian coordinates by $x = r\sin\theta\cos\phi$,
$y = r\sin\theta\sin\phi$, $z = r\cos\theta$,
with $\theta\in[0,\pi]$ and $\phi\in[0,2\pi]$.
Cylindrical coordinates are related to
Cartesian coordinates by $x = \rho\cos\varphi$,
$y = \rho\sin\varphi$ (with $\varphi\in[0,2\pi]$) and $z = z$.
Then, as $\varphi$ increases, we rotate counterclockwise
with respect to the $z$ axis.
This means that $\hat{\boldsymbol{\rho}}\times\hat{\boldsymbol{\varphi}} = +\hat{\mathbf{z}}$.


2The Laplacian of a vector field has
been defined from eq. (1.20) in terms of
the Cartesian coordinates and, in this
case, on each component of the vector
field, it has the same form as the
Laplacian acting on a scalar. This is no
longer true in polar or cylindrical coordinates.
In that case, $\nabla^2\mathbf{v}$ can be more
easily obtained from eq. (1.21), using
the corresponding expressions of $\nabla\cdot\mathbf{v}$
and $\nabla\times\mathbf{v}$.


The last equality follows because $\partial_i\partial_j$ is an operator symmetric in the
$(i, j)$ indices (we always assume that the derivatives act on functions,
or on vector fields, that are continuous and infinitely differentiable everywhere,
so the derivatives commute, $\partial_i\partial_j = \partial_j\partial_i$), and therefore it
gives zero when contracted with the antisymmetric tensor $\epsilon_{ijk}$. Thus,
the divergence of a curl vanishes. Similarly, the curl of a gradient vanishes:

$$ (\nabla\times\nabla f)_i = \epsilon_{ijk}\partial_j\partial_k f = 0 , \tag{1.19} $$

again because $\partial_j\partial_k$ is symmetric in $(j, k)$. The Laplacian of a vector
field is defined, using Cartesian coordinates, as

$$ (\nabla^2\mathbf{v})_i = \partial_j\partial_j v_i = (\partial_x^2 + \partial_y^2 + \partial_z^2)\, v_i . \tag{1.20} $$

Exercise 1.4 Using eq. (1.7), show that

$$ \nabla\times(\nabla\times\mathbf{v}) = \nabla(\nabla\cdot\mathbf{v}) - \nabla^2\mathbf{v} . \tag{1.21} $$
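The vanishing of the divergence of a curl and of the curl of a gradient can also be observed numerically. The following sketch uses central finite differences; the step size `H`, the test fields `f` and `v`, and the evaluation point are arbitrary choices made for this illustration. Because the discrete difference operators commute, just like the exact derivatives, the results should vanish up to rounding errors.

```python
import math

H = 1e-3  # finite-difference step (arbitrary illustrative choice)

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def partial(g, i, h=H):
    """Return a function approximating d g / d x_i by a central difference."""
    def dg(x):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        return (g(xp) - g(xm)) / (2.0 * h)
    return dg

def grad(g):
    """Components (grad g)_i = d_i g, as a list of three functions."""
    return [partial(g, i) for i in range(3)]

def div(v):
    """div v = d_i v_i, for v given as a list of three component functions."""
    parts = [partial(v[i], i) for i in range(3)]
    return lambda x: sum(p(x) for p in parts)

def curl(v):
    """(curl v)_i = eps_ijk d_j v_k, as a list of three functions."""
    def component(i):
        terms = [(eps(i, j, k), partial(v[k], j))
                 for j in range(3) for k in range(3) if eps(i, j, k) != 0]
        return lambda x: sum(sign * d(x) for sign, d in terms)
    return [component(i) for i in range(3)]

# Hypothetical smooth test fields
def f(x):
    return math.sin(x[0]) * x[1] ** 2 + math.exp(x[2]) * x[0]

v = [lambda x: x[1] * x[2] ** 2,
     lambda x: math.sin(x[0] * x[2]),
     lambda x: x[0] ** 2 * math.cos(x[1])]

point = [0.3, -0.7, 1.1]
div_of_curl = div(curl(v))(point)                    # expected to vanish
curl_of_grad = [c(point) for c in curl(grad(f))]     # expected to vanish
curl_of_v = [c(point) for c in curl(v)]              # generically non-zero
```

For comparison, `curl_of_v` itself is of order one at this point, so the smallness of `div_of_curl` is not an accident of the chosen field.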


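For future reference, we give the expression of the gradient and Laplacian
of a scalar field, and of the divergence and curl of a vector field,
in Cartesian coordinates $(x, y, z)$, in polar coordinates $(r, \theta, \phi)$, and in
cylindrical coordinates $(\rho, \varphi, z)$.¹,² Denoting by $\{\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}\}$, $\{\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\phi}}\}$, and
$\{\hat{\boldsymbol{\rho}}, \hat{\boldsymbol{\varphi}}, \hat{\mathbf{z}}\}$, respectively, the unit vectors in the corresponding directions,
we have

$$ \nabla f = (\partial_x f)\,\hat{\mathbf{x}} + (\partial_y f)\,\hat{\mathbf{y}} + (\partial_z f)\,\hat{\mathbf{z}} \tag{1.22} $$
$$ \phantom{\nabla f} = (\partial_r f)\,\hat{\mathbf{r}} + \frac{1}{r}(\partial_\theta f)\,\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}(\partial_\phi f)\,\hat{\boldsymbol{\phi}} \tag{1.23} $$
$$ \phantom{\nabla f} = (\partial_\rho f)\,\hat{\boldsymbol{\rho}} + \frac{1}{\rho}(\partial_\varphi f)\,\hat{\boldsymbol{\varphi}} + (\partial_z f)\,\hat{\mathbf{z}} , \tag{1.24} $$

$$ \nabla^2 f = \partial_x^2 f + \partial_y^2 f + \partial_z^2 f \tag{1.25} $$
$$ \phantom{\nabla^2 f} = \frac{1}{r^2}\partial_r(r^2\partial_r f) + \frac{1}{r^2\sin\theta}\partial_\theta(\sin\theta\,\partial_\theta f) + \frac{1}{r^2\sin^2\theta}\partial_\phi^2 f \tag{1.26} $$
$$ \phantom{\nabla^2 f} = \frac{1}{\rho}\partial_\rho(\rho\,\partial_\rho f) + \frac{1}{\rho^2}\partial_\varphi^2 f + \partial_z^2 f , \tag{1.27} $$

$$ \nabla\cdot\mathbf{v} = \partial_x v_x + \partial_y v_y + \partial_z v_z \tag{1.28} $$
$$ \phantom{\nabla\cdot\mathbf{v}} = \frac{1}{r^2}\partial_r(r^2 v_r) + \frac{1}{r\sin\theta}\partial_\theta(v_\theta\sin\theta) + \frac{1}{r\sin\theta}\partial_\phi v_\phi \tag{1.29} $$
$$ \phantom{\nabla\cdot\mathbf{v}} = \frac{1}{\rho}\partial_\rho(\rho v_\rho) + \frac{1}{\rho}\partial_\varphi v_\varphi + \partial_z v_z , \tag{1.30} $$

$$ \nabla\times\mathbf{v} = \hat{\mathbf{x}}\,(\partial_y v_z - \partial_z v_y) + \hat{\mathbf{y}}\,(\partial_z v_x - \partial_x v_z) + \hat{\mathbf{z}}\,(\partial_x v_y - \partial_y v_x) \tag{1.31} $$
$$ \phantom{\nabla\times\mathbf{v}} = \hat{\mathbf{r}}\,\frac{1}{r\sin\theta}\left[\partial_\theta(v_\phi\sin\theta) - \partial_\phi v_\theta\right] + \hat{\boldsymbol{\theta}}\left[\frac{1}{r\sin\theta}\partial_\phi v_r - \frac{1}{r}\partial_r(r v_\phi)\right] + \hat{\boldsymbol{\phi}}\,\frac{1}{r}\left[\partial_r(r v_\theta) - \partial_\theta v_r\right] \tag{1.32} $$
$$ \phantom{\nabla\times\mathbf{v}} = \hat{\boldsymbol{\rho}}\left(\frac{1}{\rho}\partial_\varphi v_z - \partial_z v_\varphi\right) + \hat{\boldsymbol{\varphi}}\,(\partial_z v_\rho - \partial_\rho v_z) + \hat{\mathbf{z}}\,\frac{1}{\rho}\left[\partial_\rho(\rho v_\varphi) - \partial_\varphi v_\rho\right] . \tag{1.33} $$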
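As a sanity check on these coordinate expressions, one can compare the polar-coordinate Laplacian of eq. (1.26) with the Cartesian one on a simple test function. The sketch below does this with central finite differences; the test function, the step size, and the evaluation point are assumptions made for this illustration only.

```python
import math

def f_cart(x, y, z):
    # Hypothetical test function; its Cartesian Laplacian is 2*y + 6*z
    return x * x * y + z ** 3

def f_sph(r, theta, phi):
    """The same function expressed through the polar coordinates (r, theta, phi)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return f_cart(x, y, z)

def d(g, i, h=1e-4):
    """Central-difference derivative of g with respect to its i-th argument."""
    def dg(*args):
        up, dn = list(args), list(args)
        up[i] += h
        dn[i] -= h
        return (g(*up) - g(*dn)) / (2.0 * h)
    return dg

def laplacian_spherical(F):
    """Laplacian in polar coordinates, term by term as in eq. (1.26)."""
    radial = d(lambda r, t, p: r ** 2 * d(F, 0)(r, t, p), 0)
    polar = d(lambda r, t, p: math.sin(t) * d(F, 1)(r, t, p), 1)
    azimuthal = d(d(F, 2), 2)
    def lap(r, t, p):
        return (radial(r, t, p) / r ** 2
                + polar(r, t, p) / (r ** 2 * math.sin(t))
                + azimuthal(r, t, p) / (r ** 2 * math.sin(t) ** 2))
    return lap

r, theta, phi = 1.3, 0.8, 2.1
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)

numeric = laplacian_spherical(f_sph)(r, theta, phi)
exact = 2.0 * y + 6.0 * z   # Cartesian Laplacian of f_cart at the same point
```

The two values agree up to the discretization error of the finite differences, which is small for the step used here.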
1.3 Integration of vector fields. Gauss’s and Stokes’s theorems


Given a vector field $\mathbf{v}(\mathbf{x})$ and a curve $C$, one can define the line integral

$$ \int_C d\boldsymbol{\ell}\cdot\mathbf{v} \tag{1.34} $$

by breaking the curve into infinitesimal segments, and introducing, in
each segment, a vector $d\boldsymbol{\ell}$ of length $d\ell$, tangent to the curve. Notice that
the line integral defined in this way is a scalar quantity. If the curve $C$
is closed, the line integral (1.34) is known as the circulation of $\mathbf{v}$ around
the curve. For a closed curve, we will denote the line integral by $\oint d\boldsymbol{\ell}$.
The integral over a two-dimensional surface $S$ can be defined similarly.
We split the surface in infinitesimal surface elements of area $ds$,³ and
we define $d\mathbf{s}$ as the vector of modulus $ds$, pointing in the direction perpendicular
to the surface element (for a closed surface, the convention
is to choose the outward normal, otherwise a choice of orientation must
be made). Writing the unit vector normal to the surface as $\hat{\mathbf{n}}$, we have
$d\mathbf{s} = \hat{\mathbf{n}}\, ds$. The surface integral of a vector field $\mathbf{v}(\mathbf{x})$ is then given by

$$ \int_S d\mathbf{s}\cdot\mathbf{v} = \int_S ds\, (\mathbf{v}\cdot\hat{\mathbf{n}}) . \tag{1.35} $$

For a closed surface, this defines the flux of $\mathbf{v}$ through $S$. In the case of
a closed surface, we will denote the surface integral by $\oint d\mathbf{s}$.

The fundamental theorem of calculus states that, for a function of a
single variable $x$,

$$ \int_{x_1}^{x_2} dx\, \frac{df}{dx} = f(x_2) - f(x_1) . \tag{1.36} $$

This can be generalized to the line integral of a function of the three-dimensional
variable $\mathbf{x}$: from the definition of the line integral (1.34)
one can show (do it!) that, for a function $f(\mathbf{x})$ integrated over a curve
$C$ with endpoints $\mathbf{x}_1$ and $\mathbf{x}_2$,

$$ \int_C d\boldsymbol{\ell}\cdot\nabla f = f(\mathbf{x}_2) - f(\mathbf{x}_1) . \tag{1.37} $$

Note in particular that, if $C$ is closed, the line integral of a gradient
vanishes. Stokes’s theorem and Gauss’s theorem are generalizations of
eq. (1.37) to surfaces and to volumes, respectively. In particular, let $C$
be a closed curve and let $S$ be any surface that has $C$ as its boundary
(i.e., $\partial S = C$, where the notation $\partial S$ stands for the boundary of $S$).
Then, Stokes’s theorem asserts that, for a vector field $\mathbf{v}(\mathbf{x})$ (with our
usual assumptions of differentiability, that we will not repeat further),

$$ \int_S d\mathbf{s}\cdot(\nabla\times\mathbf{v}) = \oint_C d\boldsymbol{\ell}\cdot\mathbf{v} \qquad \text{(Stokes’s theorem)} . \tag{1.38} $$


3When we want to stress that this is 
a two-dimensional surface element, we 
will write it as d?s. Otherwise, to sim- 
plify the notation, we write simply ds. 




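The orientation convention is that, if we go around the loop $C$ in the
direction of the line integral, the normal to $S$ is obtained with the right-hand
rule.

Another useful identity is obtained by setting, in Stokes’s theorem,
$\mathbf{v}(\mathbf{x}) = \psi(\mathbf{x})\mathbf{w}$, where $\mathbf{w}$ is a constant vector. Then, $(\nabla\times\mathbf{v})_i = \epsilon_{ijk}(\partial_j\psi) w_k$
and eq. (1.38) becomes

$$ w_k \int_S ds_i\, \epsilon_{ijk}\partial_j\psi = w_k \oint_C d\ell_k\, \psi . \tag{1.39} $$

Since this is valid for generic $\mathbf{w}$, we get

$$ \int_S ds_i\, \epsilon_{ijk}\partial_j\psi = \oint_C d\ell_k\, \psi , \tag{1.40} $$

or, in vector notation,

$$ \int_S d\mathbf{s}\times\nabla\psi = \oint_C d\boldsymbol{\ell}\, \psi . \tag{1.41} $$

Yet another useful identity following from Stokes’s theorem is obtained
by setting $\mathbf{v}(\mathbf{x}) = \mathbf{u}(\mathbf{x})\times\mathbf{w}$, where, again, $\mathbf{w}$ is a constant vector. Then,

$$ (\nabla\times\mathbf{v})_i = \epsilon_{ijk}\partial_j(\epsilon_{klm} u_l w_m) = \epsilon_{ijk}\epsilon_{klm}(\partial_j u_l) w_m = (\partial_j u_i) w_j - (\partial_j u_j) w_i , \tag{1.42} $$

where, in the last step, we used eq. (1.7). Then eq. (1.38) gives

$$ w_k\, \epsilon_{ijk} \oint_C d\ell_i\, u_j = \int_S ds_i \left[(\partial_j u_i) w_j - (\partial_j u_j) w_i\right] = w_k \left[\int_S ds_i\, \partial_k u_i - \int_S ds_k\, \partial_i u_i\right] , \tag{1.43} $$

and therefore

$$ \epsilon_{ijk} \oint_C d\ell_i\, u_j = \int_S ds_i\, \partial_k u_i - \int_S ds_k\, \partial_i u_i . \tag{1.44} $$

A useful application of this formula is obtained choosing $u_i(\mathbf{x}) = x_i$.
Then $\partial_k u_i = \delta_{ik}$ and $\partial_i u_i = 3$, so eq. (1.44) gives

$$ \epsilon_{ijk} \oint_C d\ell_i\, x_j = -2 \int_S ds_k . \tag{1.45} $$

However, for a planar surface

$$ \int_S ds_k = A\, n_k , \tag{1.46} $$

where $A$ is the area of the surface and $\hat{\mathbf{n}}$ is the unit vector normal to
it. We therefore obtain an elegant formula for the area $A$ of a planar
surface $S$, bounded by a curve $C$,

$$ A\,\hat{\mathbf{n}} = \frac{1}{2} \oint_C \mathbf{x}\times d\boldsymbol{\ell} . \tag{1.47} $$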


Gauss’s theorem extends Stokes’s theorem further, to integration over
volumes: let $V$ be a finite volume bounded by the surface $S$, i.e., $\partial V = S$.
Then,

$$ \int_V d^3x\, \nabla\cdot\mathbf{v} = \oint_S d\mathbf{s}\cdot\mathbf{v} \qquad \text{(Gauss’s theorem)} . \tag{1.48} $$

We will make use very often of both Gauss’s and Stokes’s theorems.⁴

A vector field such that $\nabla\times\mathbf{v} = 0$ everywhere is called irrotational,
or curl-free. We have seen in eq. (1.19) that, if $\mathbf{v}$ is the gradient of a
function, $\mathbf{v} = \nabla f$, then it is irrotational. A sort of converse of this
statement holds:

Theorem for curl-free fields. Let $\mathbf{v}$ be a vector field such that
$\nabla\times\mathbf{v} = 0$ everywhere in a simply connected region $V$ (i.e., such that
every loop in $V$ can be continuously shrunk to a point). Then, there
exists a function $f$ such that $\mathbf{v} = \nabla f$.

A vector field $\mathbf{v}$ such that $\nabla\cdot\mathbf{v} = 0$ is called solenoidal, or divergence-free.
Similarly, there is a sort of inverse to eq. (1.18):

Theorem for divergence-free fields. Let $\mathbf{v}$ be a vector field such
that $\nabla\cdot\mathbf{v} = 0$ everywhere in a volume $V$ such that every surface in $V$
can be continuously shrunk to a point. Then, there exists a vector field
$\mathbf{w}$ such that $\mathbf{v} = \nabla\times\mathbf{w}$.

4Proceeding as for Stokes’s theorem, if
we set $\mathbf{v}(\mathbf{x}) = \psi(\mathbf{x})\mathbf{w}$, with $\mathbf{w}$ constant,
we get the useful identity

$$ \int_V d^3x\, \nabla\psi = \oint_S d\mathbf{s}\, \psi , \tag{1.49} $$

while, setting $\mathbf{v}(\mathbf{x}) = \mathbf{u}(\mathbf{x})\times\mathbf{w}$, we get

$$ \int_V d^3x\, \nabla\times\mathbf{u} = \oint_S d\mathbf{s}\times\mathbf{u} . \tag{1.50} $$


1.4 Dirac delta 


The Dirac delta is an especially useful mathematical object, that appears
everywhere in physics. Physically, it can be seen as a model of
a point-like object. The Dirac delta $\delta(x)$ is not a function in the proper
sense. Rather, in one dimension, it is defined from the conditions that

$$ \delta(x - x_0) = 0 \quad \text{if } x \neq x_0 , \tag{1.51} $$

and that, for any function $f(x)$ regular in an integration region $I$ that
includes $x_0$,

$$ \int_I dx\, \delta(x - x_0) f(x) = f(x_0) . \tag{1.52} $$

Note that the integral on the left-hand side vanishes if $I$ does not include
$x_0$ because of eq. (1.51) [and of the assumed regularity of $f(x)$]. On the
other hand, again because of eq. (1.51), the integral in the left-hand side
of eq. (1.52) is independent of $I$, as long as $x_0 \in I$. In the following we
will set for definiteness $I = (-\infty, +\infty)$, so eq. (1.52) reads

$$ \int_{-\infty}^{+\infty} dx\, \delta(x - x_0) f(x) = f(x_0) . \tag{1.53} $$
Fig. 1.1 A sequence of approximations to the Dirac $\delta$ function
for increasing $n$, using the gaussians (1.55) (top panel)
or the “rectangles” (1.56) (lower panel).

Observe that, applying this definition to the case of the function $f(x) = 1$,
we get the normalization condition

$$ \int_{-\infty}^{+\infty} dx\, \delta(x - x_0) = 1 . \tag{1.54} $$

Since the Dirac delta vanishes at all points $x \neq x_0$, but still the integral
in the left-hand side of eq. (1.53) is non-zero, it must be singular in $x_0$.
Actually, the Dirac delta is not a proper function, but can be defined
by considering a sequence of functions $\delta_n(x, x_0)$ such that, as $n\to\infty$,
$\delta_n(x, x_0) \to 0$ for $x \neq x_0$, and $\delta_n(x, x_0) \to +\infty$ for $x = x_0$, while
maintaining the normalization condition (1.54). As an example, one can
take a sequence of gaussians centered on $x_0$, with smaller and smaller
width,

$$ \delta_n(x - x_0) = \frac{1}{\sqrt{2\pi}\,\sigma_n}\, e^{-(x - x_0)^2/(2\sigma_n^2)} , \tag{1.55} $$

with $\sigma_n = 1/n$. Another option could be to use

$$ \delta_n(x - x_0) = \begin{cases} n & \text{for } |x - x_0| < 1/(2n) \\ 0 & \text{for } |x - x_0| > 1/(2n) . \end{cases} \tag{1.56} $$

These two sequences of functions are shown in Fig. 1.1. In both cases,
the limit of $\delta_n(x - x_0)$ for $n\to\infty$ does not exist, since it diverges when
$x = x_0$, and therefore does not define a proper function $\delta(x)$. However,
one can generalize the notion of functions to the notion of distributions
(or “improper functions”), which are defined from their action inside
an integral, when convolved with “test” functions $f(x)$ (with suitably
defined properties of regularity and, possibly, behavior at infinity). In
the case of the Dirac delta, the definition in the sense of distributions is
given by eq. (1.53). Using the explicit expression of the functions $\delta_n(x)$
given in eq. (1.56), for a function $f(x)$ regular near $x_0$, we get

$$ \lim_{n\to\infty} \int_{-\infty}^{+\infty} dx\, \delta_n(x - x_0) f(x) = \lim_{n\to\infty} n \int_{x_0 - \frac{1}{2n}}^{x_0 + \frac{1}{2n}} dx\, f(x) = f(x_0) . \tag{1.57} $$

Therefore, in the sense of distributions, i.e., after multiplying by a
smooth function $f(x)$ and integrating, we have

$$ \delta(x - x_0) = \lim_{n\to\infty} \delta_n(x - x_0) . \tag{1.58} $$

From the definition, we see that the Dirac delta only makes sense when
it appears inside an integral. In physics, however, with an abuse of notation,
the universal use is to treat it as if it were a normal function (and
it is even called the Dirac delta “function”!), with the understanding
that the relations in which it enters must be understood in the sense of
distributions, i.e., multiplied by a test function and integrated.
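The convergence of the two sequences in the sense of distributions can be observed numerically. The sketch below integrates the gaussians and the "rectangles" against a test function with a simple midpoint rule; the test function `f`, the point `x0`, the values of `n`, and the integration settings are all assumptions chosen for this illustration. The Gaussian integral is truncated at ten widths from `x0`, where the integrand is negligible.

```python
import math

def delta_gauss(x, x0, n):
    """Gaussian approximation of delta(x - x0), with width sigma_n = 1/n."""
    sigma = 1.0 / n
    return (math.exp(-(x - x0) ** 2 / (2.0 * sigma ** 2))
            / (math.sqrt(2.0 * math.pi) * sigma))

def integrate(g, a, b, steps):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def f(x):
    # Hypothetical smooth test function
    return math.cos(x) + x ** 2

x0 = 0.5
errors_gauss, errors_rect = [], []
for n in (1, 10, 100):
    # Gaussians: integrate delta_n(x - x0) f(x) over x0 +/- 10 sigma_n
    gauss = integrate(lambda x: delta_gauss(x, x0, n) * f(x),
                      x0 - 10.0 / n, x0 + 10.0 / n, 20000)
    # Rectangles: n times the integral of f over [x0 - 1/(2n), x0 + 1/(2n)]
    half = 1.0 / (2 * n)
    rect = n * integrate(f, x0 - half, x0 + half, 2000)
    errors_gauss.append(abs(gauss - f(x0)))
    errors_rect.append(abs(rect - f(x0)))
```

Both lists of errors decrease as `n` grows, with the smeared value approaching `f(x0)`; for smooth test functions the leading correction scales like $1/n^2$ in both cases.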


From the definitions (1.55) or (1.56), setting $x_0 = 0$, we see that
the Dirac delta function is even in $x$, $\delta(x) = \delta(-x)$. Two more useful
properties are left as an exercise to the reader:

Exercise 1.5 Using the definition (1.53) show that, if $a$ is a non-zero
real number,

$$ \delta(ax) = \frac{1}{|a|}\,\delta(x) . \tag{1.59} $$

Exercise 1.6 Using again eq. (1.53) show that, if $g(x)$ is a function
which has only one simple zero in $x = x_0$, then

$$ \delta(g(x)) = \frac{1}{|g'(x_0)|}\,\delta(x - x_0) , \tag{1.60} $$

where $g'(x) = dg/dx$. Show that, if $g(x)$ has several simple zeros at the
points $x = x_i$ ($i = 1, \ldots, n$), this generalizes to

$$ \delta(g(x)) = \sum_{i=1}^{n} \frac{1}{|g'(x_i)|}\,\delta(x - x_i) . \tag{1.61} $$

Exercise 1.7 Show that, in the sense of distributions,

$$ x\,\delta(x) = 0 . \tag{1.62} $$
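Before proving the properties in these exercises, one can test them numerically by replacing the Dirac delta with a narrow Gaussian. In the sketch below the test function, the value of `n`, the constant `a = -3` and the choice $g(x) = x^2 - 1$ (simple zeros at $x = \pm 1$, with $|g'(\pm 1)| = 2$) are hypothetical choices for illustration; agreement holds only up to corrections that vanish as `n` grows.

```python
import math

def delta_n(u, n):
    """Gaussian approximation of the Dirac delta, with width sigma_n = 1/n."""
    sigma = 1.0 / n
    return (math.exp(-u * u / (2.0 * sigma * sigma))
            / (math.sqrt(2.0 * math.pi) * sigma))

def integrate(g, a, b, steps):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def f(x):
    # Hypothetical smooth test function
    return math.exp(x)

n = 100

# delta(a x) = delta(x) / |a|, tested here with a = -3
a = -3.0
scaled = integrate(lambda x: delta_n(a * x, n) * f(x), -1.0, 1.0, 200000)
scaled_expected = f(0.0) / abs(a)

# delta(g(x)) with g(x) = x^2 - 1: sum over the two simple zeros x = +1, -1
composed = integrate(lambda x: delta_n(x * x - 1.0, n) * f(x), -3.0, 3.0, 300000)
composed_expected = (f(1.0) + f(-1.0)) / 2.0

# x delta(x) = 0 in the sense of distributions
x_times_delta = integrate(lambda x: x * delta_n(x, n) * f(x), -1.0, 1.0, 200000)
```

Note that the factor $1/|a|$ appears with the absolute value even though `a` is negative here, in line with the fact that the delta is even.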


Another useful notion is the derivative of the Dirac delta, $\delta'(x)$, which,
in the sense of distributions, is defined from

$$ \int_{-\infty}^{+\infty} dx\, \delta'(x - x_0) f(x) \equiv -\int_{-\infty}^{+\infty} dx\, \delta(x - x_0) f'(x) = -f'(x_0) . \tag{1.63} $$

This definition is clearly motivated by the analogy with the integration
by parts of a normal function. Taking $\delta(x)$ as the limit for $n\to\infty$ (in the
sense of distributions) of a sequence $\delta_n(x)$ of continuous and differentiable
functions, such as the gaussians (1.55), one can see that $\delta'(x)$ is obtained,
again in the sense of distributions, from the limit $n\to\infty$ of $\delta_n'(x)$.
Indeed, for $\delta_n(x)$ differentiable (and vanishing at $x = \pm\infty$), the standard
integration by parts goes through,⁵ and

$$ \lim_{n\to\infty} \int_{-\infty}^{+\infty} dx\, \delta_n'(x - x_0) f(x) = -\lim_{n\to\infty} \int_{-\infty}^{+\infty} dx\, \delta_n(x - x_0) f'(x) = -f'(x_0) , \tag{1.64} $$

where the last equality follows from eq. (1.58). Therefore, in the same
sense as eq. (1.58),

$$ \delta'(x - x_0) = \lim_{n\to\infty} \delta_n'(x - x_0) . \tag{1.65} $$
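The statement that the derivative of the delta is obtained from the derivatives of the approximating gaussians can be illustrated numerically. In this sketch the test function $f(x) = \sin 2x$, the point `x0` and the integration settings are assumptions made for the example; the integrals are truncated at ten Gaussian widths from `x0`, where the boundary terms of the integration by parts are negligible.

```python
import math

def delta_n(x, x0, n):
    """Gaussian approximation of delta(x - x0), with width sigma_n = 1/n."""
    sigma = 1.0 / n
    return (math.exp(-(x - x0) ** 2 / (2.0 * sigma ** 2))
            / (math.sqrt(2.0 * math.pi) * sigma))

def ddelta_n(x, x0, n):
    """Derivative with respect to x of the Gaussian delta_n(x - x0)."""
    sigma = 1.0 / n
    return -(x - x0) / (sigma * sigma) * delta_n(x, x0, n)

def integrate(g, a, b, steps):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def f(x):
    # Hypothetical test function
    return math.sin(2.0 * x)

def fprime(x):
    return 2.0 * math.cos(2.0 * x)

x0 = 0.3
errors, gaps = [], []
for n in (10, 100):
    a, b = x0 - 10.0 / n, x0 + 10.0 / n
    lhs = integrate(lambda x: ddelta_n(x, x0, n) * f(x), a, b, 20000)
    rhs = -integrate(lambda x: delta_n(x, x0, n) * fprime(x), a, b, 20000)
    gaps.append(abs(lhs - rhs))                 # integration by parts at fixed n
    errors.append(abs(lhs - (-fprime(x0))))     # distance from the limit -f'(x0)
```

At each fixed `n` the two integrals agree to high accuracy, as expected from the integration by parts, while their common value approaches $-f'(x_0)$ only as `n` increases.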
Notice that $\delta'(x)$ is an odd function of $x$, $\delta'(-x) = -\delta'(x)$, as is
clear from its representation in terms of $\delta_n'(x)$, with $\delta_n(x)$ given by the
5We also assume that the test functions
$f(x)$ go to zero at $\pm\infty$. In fact, here
it is sufficient that they do not grow
so fast to compensate the exponential
decay of the gaussian $\delta_n(x)$, so that, in
the integration by parts, we can discard
the boundary terms at infinity.
6Note that we have proved eq. (1.68)
for $x > 0$ and for $x < 0$. Whether it also
holds for $x = 0$ depends on the sequence
$\delta_n(x)$ that we use for approximating
$\delta(x)$, and on how we define $\theta(x = 0)$.
If we use functions $\delta_n(x)$ that are symmetric
around $x = 0$, as in eqs. (1.55)
or (1.56), the equality holds if we define
$\theta(0) = 1/2$. However, one could use different,
non-symmetric sequences $\delta_n(x)$, and
one should then assign a different value
to $\theta(0)$ for the equality to hold. In fact,
the whole issue is irrelevant since the
relation (1.70) holds only in the sense
of distributions, i.e., after integrating,
and the fact that it holds in a single
point or not does not affect any integrated
relation.


7For instance, by completing the square
in the exponent one gets

$$ \int_{-\infty}^{+\infty} dx\, e^{-\frac{n^2}{2}x^2} e^{-ikx} = e^{-\frac{k^2}{2n^2}} \int_{-\infty}^{+\infty} dx\, e^{-\frac{n^2}{2}\left(x + \frac{ik}{n^2}\right)^2} = e^{-\frac{k^2}{2n^2}} \int_{-\infty}^{+\infty} dx'\, e^{-\frac{n^2}{2}x'^2} = e^{-\frac{k^2}{2n^2}}\, \frac{\sqrt{2\pi}}{n} , \tag{1.72} $$

where we introduced $x' = x + ik/n^2$.
More precisely, we have actually considered
a closed contour in the complex
plane $z = x + iy$, composed of the $x$
axis $y = 0$, and of the parallel line
$y = k/n^2$, and closed the contour joining
these two lines at infinity. Since the
integrand has no singularity inside this
contour, by the Cauchy theorem the integral
over the $x$ axis is the same as the
integral over the line $y = k/n^2$, i.e.,
over the variable $z = x + ik/n^2$.


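sequence of gaussians. Higher-order derivatives of the Dirac delta are
defined similarly,

$$\int_{-\infty}^{+\infty} dx \left[\frac{d^m}{dx^m}\delta(x - x_0)\right] f(x) = (-1)^m \int_{-\infty}^{+\infty} dx\, \delta(x - x_0)\, \frac{d^m f(x)}{dx^m} = (-1)^m \frac{d^m f}{dx^m}(x_0). \tag{1.66}$$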
There is an interesting relation between the Dirac delta and the Heaviside
theta function (also called simply the “theta function”), defined by

$$\theta(x) = \begin{cases} 1 & \text{for } x > 0 \\ 0 & \text{for } x < 0. \end{cases} \tag{1.67}$$

Observe that, from the definition of the Dirac delta,

$$\int_{-\infty}^{x} dx'\, \delta(x') = \theta(x). \tag{1.68}$$


Indeed, if $x < 0$, the integration region does not include the point $x' = 0$
where $\delta(x')$ is singular, so the integral vanishes. For $x > 0$, instead,
we can extend the integral in eq. (1.68) up to $x' = \infty$, since anyhow
$\delta(x') = 0$ for $x' > 0$, and we can then use eq. (1.54) to show that the
integral is equal to one. Conversely, for a differentiable function $f(x)$
that vanishes at infinity, treating $\theta(x)$ as a distribution and defining its
derivative as we have done for the Dirac delta,

$$\begin{aligned}
\int_{-\infty}^{+\infty} dx\, \theta'(x) f(x) &= -\int_{-\infty}^{+\infty} dx\, \theta(x) f'(x) \\
&= -\int_{0}^{\infty} dx\, \frac{df}{dx} \\
&= -[f(\infty) - f(0)] \\
&= f(0).
\end{aligned} \tag{1.69}$$

This shows that, in the sense of distributions,

$$\theta'(x) = \delta(x), \tag{1.70}$$


which could also have been formally derived by taking the derivative
of eq. (1.68).⁶ This result could have also been proved using a sequence
$\delta_n(x)$ of approximations to the Dirac delta, and showing that, plugging it
into the left-hand side of eq. (1.68), we get a continuous and differentiable
approximation to the theta function.
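The last remark can be made concrete with a small sketch. Assuming the gaussians (1.55) have the form $\delta_n(x) = (n/\sqrt{2\pi})\, e^{-n^2 x^2/2}$, their integral from $-\infty$ to $x$ can be written in closed form through the error function, $\theta_n(x) = \frac{1}{2}[1 + \mathrm{erf}(n x/\sqrt{2})]$. The name `theta_n` and the sample points are illustrative choices.

```python
import math

def theta_n(x, n):
    """Integral of the gaussian delta_n from -infinity to x (assumed form of eq. (1.55))."""
    return 0.5 * (1.0 + math.erf(n * x / math.sqrt(2.0)))

# Each row shows theta_n at x = -0.1, 0, +0.1 for increasing n.
rows = {n: [theta_n(x, n) for x in (-0.1, 0.0, 0.1)] for n in (1, 10, 100)}
for n, values in rows.items():
    print(n, values)
```

For these symmetric gaussians the value at $x = 0$ stays at $1/2$ for every $n$, while the values at any fixed $x \neq 0$ approach 0 or 1, consistently with the remark on $\theta(0)$ in the accompanying note.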

The Dirac delta has an extremely useful integral representation. A
simple way to obtain it is to use the sequence of gaussians (1.55). Then,

$$\begin{aligned}
\int_{-\infty}^{+\infty} dx\, \delta_n(x)\, e^{-ikx} &= \frac{n}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dx\, e^{-\frac{n^2 x^2}{2} - ikx} \\
&= e^{-k^2/(2n^2)},
\end{aligned} \tag{1.71}$$

as can be proven by carrying out the integral on the right-hand side of
the first line.⁷ Similarly,


$$\int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{-\frac{k^2}{2n^2} + ikx} = \frac{n}{\sqrt{2\pi}}\, e^{-\frac{n^2 x^2}{2}} = \delta_n(x). \tag{1.73}$$
Taking the limit $n \to \infty$ of these relations we therefore get

$$\int_{-\infty}^{+\infty} dx\, \delta(x)\, e^{-ikx} = 1, \tag{1.74}$$

and its inverse relation

$$\int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{ikx} = \delta(x). \tag{1.75}$$


Equation (1.74) could have been derived more simply from the defining
property of the Dirac delta, eq. (1.53), observing that, at $x = 0$, $e^{-ikx} = 1$.
Equation (1.75) was less evident, and provides a very useful integral
representation of the Dirac delta. In Section 1.5, we will use it to give
a simple derivation of the inversion formula of the Fourier transform.
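A quick numerical check of eq. (1.71) is sketched below. It assumes the gaussian form $\delta_n(x) = (n/\sqrt{2\pi})\, e^{-n^2 x^2/2}$ for eq. (1.55) and evaluates the integral by a simple trapezoidal rule; the helper names and the sample values of $k$ and $n$ are illustrative.

```python
import cmath
import math

def delta_n(x, n):
    """Gaussian approximation to the Dirac delta (assumed form of eq. (1.55))."""
    return n / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * (n * x) ** 2)

def fourier_of_delta_n(k, n, widths=12.0, steps=20000):
    """Trapezoidal estimate of the integral of delta_n(x) exp(-i k x) over x."""
    a = -widths / n                      # the gaussian is negligible beyond 12 widths
    h = 2.0 * widths / (n * steps)
    total = 0j
    for i in range(steps + 1):
        x = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * delta_n(x, n) * cmath.exp(-1j * k * x)
    return total * h

k = 3.0
for n in (2, 20, 200):
    numeric = fourier_of_delta_n(k, n)
    closed_form = math.exp(-k ** 2 / (2.0 * n ** 2))   # right-hand side of eq. (1.71)
    print(n, numeric.real, closed_form)
```

As $n$ increases, both columns approach 1, which is the content of eq. (1.74).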
The generalization to more than one dimension is straightforward. In
particular, in three spatial dimensions, we define the three-dimensional
Dirac delta as

$$\delta^{(3)}(\mathbf{x}) = \delta(x)\,\delta(y)\,\delta(z), \tag{1.77}$$


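and this is a distribution to be multiplied by test functions $f(\mathbf{x})$ and
integrated over $d^3x$. Then, eq. (1.53) becomes

$$\int d^3x\, \delta^{(3)}(\mathbf{x} - \mathbf{x}_0)\, f(\mathbf{x}) = f(\mathbf{x}_0), \tag{1.78}$$

while the integral representation (1.75) becomes

$$\int \frac{d^3k}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{x}} = \delta^{(3)}(\mathbf{x}). \tag{1.79}$$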


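Example 1.1 Divergence of $\hat{\mathbf{r}}/r^2$. As a particularly important
application of the concepts developed in Sections 1.2, 1.3, and in the present
section, we perform the computation of the divergence of the vector field

$$\mathbf{v}(\mathbf{x}) = \frac{\hat{\mathbf{r}}}{r^2} = \frac{\mathbf{x}}{r^3}, \tag{1.80}$$

where $r = |\mathbf{x}|$ and $\hat{\mathbf{r}}$ is the unit vector in the radial direction. In polar
coordinates, $v_r = 1/r^2$, $v_\theta = v_\phi = 0$. If we use eq. (1.29), we apparently
get

$$\nabla\cdot\mathbf{v} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 v_r\right) = 0. \tag{1.81}$$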
⁸Note that, renaming the integration variable $k \to -k$, we can also write
eq. (1.75) as

$$\int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{-ikx} = \delta(x). \tag{1.76}$$
⁹Recall that the infinitesimal solid angle $d\Omega$ is defined from the
transformation from Cartesian to spherical coordinates. In two dimensions, from
$x = r\cos\theta$ and $y = r\sin\theta$, it follows that $dx\,dy = r\,dr\,d\theta$. Similarly, in three
dimensions, the relation between Cartesian and spherical coordinates (as we
already mentioned in Note 1 on page 4) is $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$,
$z = r\cos\theta$, with $\theta \in [0, \pi]$ and $\phi \in [0, 2\pi]$. Computing the Jacobian of the
transformation,

$$dx\,dy\,dz = r^2 \sin\theta\, dr\, d\theta\, d\phi = r^2\, dr\, d\Omega, \tag{1.82}$$

where

$$d\Omega = \sin\theta\, d\theta\, d\phi. \tag{1.83}$$

Then, the total integral over the solid angle is

$$\int d\Omega = \int_0^{\pi} d\theta\, \sin\theta \int_0^{2\pi} d\phi = 4\pi, \tag{1.84}$$

so the total solid angle in three dimensions is $4\pi$. Observe that we can also
write $d\Omega = d\cos\theta\, d\phi$, reabsorbing the minus sign from $d\cos\theta = -\sin\theta\, d\theta$ into
a change of the integration limits, so that

$$\int d\Omega = \int_{-1}^{1} d\cos\theta \int_0^{2\pi} d\phi. \tag{1.85}$$


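However, we have stressed that this only holds for $r \neq 0$, since these
manipulations become undefined at $r = 0$. To understand the behavior
of $\nabla\cdot\mathbf{v}$ at $r = 0$, we use Gauss’s theorem (1.48) in a volume $V$ given by
a spherical ball of radius $R$. Its boundary is the sphere $S^2$, and on the
boundary $\mathbf{v} = \hat{\mathbf{r}}/R^2$, while $d\mathbf{s} = R^2 d\Omega\, \hat{\mathbf{r}}$, where $d\Omega$ is the infinitesimal
solid angle.⁹ Then,

$$\int_V d^3x\, \nabla\cdot\mathbf{v} = \int_{S^2} d\mathbf{s}\cdot\mathbf{v} = \int d\Omega = 4\pi. \tag{1.86}$$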


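This shows that $\nabla\cdot\mathbf{v}$ cannot be zero everywhere. Rather, since $\nabla\cdot\mathbf{v} = 0$
for $r \neq 0$, but still its integral in $d^3x$ over any volume $V$ that contains the
origin is equal to $4\pi$, we must have

$$\nabla\cdot\left(\frac{\hat{\mathbf{r}}}{r^2}\right) = 4\pi\, \delta^{(3)}(\mathbf{x}). \tag{1.87}$$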


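From this, we can obtain another very useful result. Using the expression
(1.23) of the gradient in polar coordinates, we get

$$\nabla\left(\frac{1}{r}\right) = -\frac{\hat{\mathbf{r}}}{r^2}. \tag{1.88}$$

Writing

$$\nabla^2\left(\frac{1}{r}\right) = \nabla\cdot\left(\nabla\frac{1}{r}\right), \tag{1.89}$$

we see that eq. (1.87) implies that

$$\nabla^2 \frac{1}{r} = -4\pi\, \delta^{(3)}(\mathbf{x}). \tag{1.90}$$

Replacing $\mathbf{x}$ by $\mathbf{x} - \mathbf{x}'$, for $\mathbf{x}'$ generic, we then also have

$$\nabla^2_{\mathbf{x}} \frac{1}{|\mathbf{x} - \mathbf{x}'|} = -4\pi\, \delta^{(3)}(\mathbf{x} - \mathbf{x}'). \tag{1.91}$$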


We will use this result in Section 4.1.2, when we will introduce the notion 
of Green’s function of the Laplacian. 
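One way to see eq. (1.90) at work numerically is to smooth out the singularity. The sketch below uses the regularized function $1/\sqrt{r^2 + a^2}$, which is an assumption introduced here for illustration and not a construction used in the text. Its Laplacian is $-3a^2/(r^2 + a^2)^{5/2}$, and its integral over a ball of radius $R$ is $-4\pi R^3/(R^2 + a^2)^{3/2}$, which tends to $-4\pi$ as $a \to 0$ for any fixed $R$.

```python
import math

def laplacian_regularized(r, a):
    """Laplacian of 1 / sqrt(r^2 + a^2), computed analytically."""
    return -3.0 * a * a / (r * r + a * a) ** 2.5

def integral_over_ball(a, R=1.0, steps=200000):
    """Midpoint-rule integral of the Laplacian over a ball of radius R."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * laplacian_regularized(r, a)
    return total * h

for a in (0.1, 0.01):
    numeric = integral_over_ball(a)
    closed_form = -4.0 * math.pi / (1.0 + a * a) ** 1.5   # value for R = 1
    print(a, numeric, closed_form, -4.0 * math.pi)
```

As $a$ decreases, the Laplacian becomes more and more concentrated near the origin while its integral approaches $-4\pi$, which is the behavior encoded by $-4\pi\,\delta^{(3)}(\mathbf{x})$.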


1.5 Fourier transform 


We next recall the definition and some basic properties of the Fourier
transform. For a function of one spatial variable $f(x)$, we define the
Fourier transform $\tilde{f}(k)$ as¹⁰

$$\tilde{f}(k) = \int_{-\infty}^{+\infty} dx\, f(x)\, e^{-ikx}. \tag{1.92}$$

Observe that, if $f(x)$ is real, $\tilde{f}^*(k) = \tilde{f}(-k)$. The simplest way to invert
this relation between $f(x)$ and $\tilde{f}(k)$ is to use the integral representation
of the Dirac delta, eq. (1.75).¹¹ Multiplying eq. (1.92) by $e^{ikx}$, integrating
over $dk/(2\pi)$, and changing the name of the integration variable to $x'$ on
the right-hand side of eq. (1.92), we get


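$$\begin{aligned}
\int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, \tilde{f}(k)\, e^{ikx}
&= \int_{-\infty}^{+\infty} \frac{dk}{2\pi} \int_{-\infty}^{+\infty} dx'\, f(x')\, e^{ik(x - x')} \\
&= \int_{-\infty}^{+\infty} dx'\, f(x') \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, e^{ik(x - x')} \\
&= \int_{-\infty}^{+\infty} dx'\, f(x')\, \delta(x - x') \\
&= f(x).
\end{aligned} \tag{1.93}$$

Therefore, the inversion of eq. (1.92) is¹²

$$f(x) = \int_{-\infty}^{+\infty} \frac{dk}{2\pi}\, \tilde{f}(k)\, e^{ikx}. \tag{1.96}$$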


Another useful relation is obtained considering the convolution of two
functions $f(x)$ and $g(x)$, defined by

$$F(x) = \int_{-\infty}^{+\infty} dx'\, f(x')\, g(x - x'). \tag{1.97}$$

Taking the Fourier transform we get¹³

$$\tilde{F}(k) = \tilde{f}(k)\, \tilde{g}(k). \tag{1.98}$$


Therefore, the Fourier transform of a convolution is equal to the product 
of the Fourier transforms. This is known as the convolution theorem. 
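The convolution theorem can be checked numerically on a simple case. The sketch below uses two normalized gaussians of widths 0.7 and 1.1 (an arbitrary illustrative choice) and approximates all integrals by sums on a uniform grid. For gaussians the Fourier transform in the convention of eq. (1.92) is known in closed form, $e^{-\sigma^2 k^2/2}$, which provides an independent cross-check.

```python
import cmath
import math

def gaussian(x, sigma):
    """Normalized gaussian of width sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

L, N = 10.0, 401
h = 2.0 * L / (N - 1)
xs = [-L + i * h for i in range(N)]
f_vals = [gaussian(x, 0.7) for x in xs]
g_vals = [gaussian(x, 1.1) for x in xs]

# Convolution F(x) = integral dx' f(x') g(x - x'), as in eq. (1.97).
F_vals = [sum(f_vals[j] * gaussian(x - xs[j], 1.1) for j in range(N)) * h for x in xs]

def transform(values, k):
    """Grid approximation of the Fourier transform, eq. (1.92)."""
    return sum(v * cmath.exp(-1j * k * x) for v, x in zip(values, xs)) * h

k = 1.3
lhs = transform(F_vals, k)                         # transform of the convolution
rhs = transform(f_vals, k) * transform(g_vals, k)  # product of the transforms
closed_form = math.exp(-0.5 * (0.7 ** 2 + 1.1 ** 2) * k ** 2)
print(lhs.real, rhs.real, closed_form)
```

The three numbers agree to high accuracy because the grid sums converge very quickly for smooth, rapidly decaying functions.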

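Equations (1.92), (1.96), and (1.98) are easily generalized to any number
of spatial dimensions. In particular, in three spatial dimensions, the
Fourier transform is defined as

$$\tilde{f}(\mathbf{k}) = \int d^3x\, f(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}}, \tag{1.99}$$

and its inversion gives

$$f(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \tilde{f}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{1.100}$$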
¹⁰More precisely, one must restrict to a space of functions such that the
manipulations below are well defined. This can be obtained, for instance, by
considering $f \in L^1(\mathbb{R})$, the space of functions whose absolute value is
integrable over $\mathbb{R}$.

¹¹This was not the historical path. The theory of distributions, which puts the
notion of Dirac delta on a sound mathematical basis, was only developed in
the first half of the 20th century, while the original work of Fourier dates back
to 1822.


¹²There are different conventions for the factors $2\pi$ in the definition of the
Fourier transform. The one that we have used is, nowadays, the most common
in physics. Another common choice is to define

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dx\, f(x)\, e^{-ikx}, \tag{1.94}$$

in which case eq. (1.96) becomes

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} dk\, \tilde{f}(k)\, e^{ikx}. \tag{1.95}$$


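¹³The explicit computation goes as follows:

$$\begin{aligned}
\tilde{F}(k) &= \int_{-\infty}^{+\infty} dx\, F(x)\, e^{-ikx} \\
&= \int_{-\infty}^{+\infty} dx \int_{-\infty}^{+\infty} dx'\, f(x')\, g(x - x')\, e^{-ik[(x - x') + x']} \\
&= \int_{-\infty}^{+\infty} dx'\, f(x')\, e^{-ikx'} \int_{-\infty}^{+\infty} dx\, g(x - x')\, e^{-ik(x - x')}.
\end{aligned}$$

Introducing $y = x - x'$, the last integral over $dx$ at fixed $x'$ is the same as an
integral over $dy$, so

$$\tilde{F}(k) = \int_{-\infty}^{+\infty} dx'\, f(x')\, e^{-ikx'} \int_{-\infty}^{+\infty} dy\, g(y)\, e^{-iky} = \tilde{f}(k)\, \tilde{g}(k).$$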
¹⁴Indeed, this function does not belong to $L^1(\mathbb{R})$, compare with Note 10 on
page 13.


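If $f(\mathbf{x})$ is real, $\tilde{f}^*(\mathbf{k}) = \tilde{f}(-\mathbf{k})$. The convolution theorem now tells us
that, if

$$F(\mathbf{x}) = \int d^3x'\, f(\mathbf{x}')\, g(\mathbf{x} - \mathbf{x}'), \tag{1.101}$$

then

$$\tilde{F}(\mathbf{k}) = \tilde{f}(\mathbf{k})\, \tilde{g}(\mathbf{k}). \tag{1.102}$$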


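For a function of time $f(t)$ we will denote the integration variable that
enters in the Fourier transform by $\omega$, and we will also use a different
sign convention, defining the Fourier transform $\tilde{f}(\omega)$ as

$$\tilde{f}(\omega) = \int_{-\infty}^{+\infty} dt\, f(t)\, e^{i\omega t}, \tag{1.103}$$

so that

$$f(t) = \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \tilde{f}(\omega)\, e^{-i\omega t}. \tag{1.104}$$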


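In the context of Special Relativity, the advantage of using a different
sign convention between the spatial and temporal Fourier transforms is
that the Fourier transform of a function $f(t, \mathbf{x})$, with respect to both $t$
and $\mathbf{x}$, can be written as

$$f(t, \mathbf{x}) = \int \frac{d\omega\, d^3k}{(2\pi)^4}\, \tilde{f}(\omega, \mathbf{k})\, e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})}, \tag{1.105}$$

i.e.,

$$\tilde{f}(\omega, \mathbf{k}) = \int dt\, d^3x\, f(t, \mathbf{x})\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{x})}, \tag{1.106}$$

and, as we will see, the combination $(\omega t - \mathbf{k}\cdot\mathbf{x})$ is more natural from the
point of view of Special Relativity and Lorentz invariance.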


Example 1.2 In this example, we compute the three-dimensional Fourier
transform of the function

$$f(r) = \frac{1}{r}, \tag{1.107}$$

that appears in the Coulomb potential. A direct computation leads to
a problem of convergence.¹⁴ Indeed, from eqs. (1.99), (1.82), and (1.85),


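$$\begin{aligned}
\tilde{f}(\mathbf{k}) &= \int d^3x\, \frac{1}{r}\, e^{-i\mathbf{k}\cdot\mathbf{x}} \\
&= \int_0^{\infty} dr\, r^2 \int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta\, \frac{1}{r}\, e^{-ikr\cos\theta},
\end{aligned} \tag{1.108}$$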


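where, to perform the integral, we have written $d^3x$ in polar coordinates
with $\mathbf{k}$ as polar axis, so $\theta$ is the angle between $\mathbf{k}$ and $\mathbf{x}$. The integral over
$d\phi$ just gives a $2\pi$ factor, and the integral over $\cos\theta$ is also elementary,

$$\int_{-1}^{1} d\cos\theta\, e^{-ikr\cos\theta} = \frac{2}{kr}\sin(kr), \tag{1.109}$$

so we get

$$\tilde{f}(\mathbf{k}) = \frac{4\pi}{k^2} \int_0^{\infty} du\, \sin u, \tag{1.110}$$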


where $k = |\mathbf{k}|$ and $u = kr$. The integral over $du$, however, does not
converge at $u = \infty$, and is not well defined without a prescription. We
then start by computing first the Fourier transform of the function

$$V(r) = \frac{e^{-\mu r}}{r}, \tag{1.111}$$

where $\mu > 0$, and has the dimensions of the inverse of a length. This
function, which corresponds to an interaction potential known as the
Yukawa potential, reduces to the Coulomb potential as $\mu \to 0^+$. Performing
the same steps as before gives


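$$\tilde{V}(\mathbf{k}) = \frac{4\pi}{k^2} \int_0^{\infty} du\, \sin u\, e^{-\epsilon u}, \tag{1.112}$$

where $\epsilon = \mu/k$. For non-zero $\epsilon$, i.e., non-zero $\mu$, the factor $e^{-\epsilon u}$ ensures
the convergence of the integral. Writing $\sin u = (e^{iu} - e^{-iu})/(2i)$, the
integral is elementary,

$$\tilde{V}(\mathbf{k}) = \frac{4\pi}{k^2}\, \frac{1}{2i} \left[ \frac{e^{(i - \epsilon)u}}{i - \epsilon} + \frac{e^{-(i + \epsilon)u}}{i + \epsilon} \right]_0^{\infty} = \frac{4\pi}{k^2}\, \frac{1}{1 + \epsilon^2}. \tag{1.113}$$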


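Therefore,

$$V(r) = \frac{e^{-\mu r}}{r} \quad\Longrightarrow\quad \tilde{V}(\mathbf{k}) = \frac{4\pi}{k^2 + \mu^2}. \tag{1.114}$$

Since the limit $\mu \to 0^+$ of these expressions is well defined, we can now
define the Fourier transform of the Coulomb potential as the limit for
$\mu \to 0^+$ of the Fourier transform of the Yukawa potential,¹⁵ so

$$f(\mathbf{x}) = \frac{1}{r} \quad\Longrightarrow\quad \tilde{f}(\mathbf{k}) = \frac{4\pi}{k^2}. \tag{1.115}$$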


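The correctness of this limiting procedure can be checked by observing that,
if we instead start from $\tilde{f}(\mathbf{k}) = 4\pi/k^2$ and compute the inverse Fourier
transform, we get $1/r$ without the need of regulating any divergence.
Indeed, in this case we get

$$\begin{aligned}
f(r) &= \int \frac{d^3k}{(2\pi)^3}\, \frac{4\pi}{k^2}\, e^{i\mathbf{k}\cdot\mathbf{x}} \\
&= \frac{4\pi}{8\pi^3}\, 2\pi \int_0^{\infty} k^2\, dk \int_{-1}^{1} d\cos\theta\, \frac{1}{k^2}\, e^{ikr\cos\theta} \\
&= \frac{2}{\pi r} \int_0^{\infty} du\, \frac{\sin u}{u} \\
&= \frac{1}{r},
\end{aligned} \tag{1.116}$$

since $\int_0^{\infty} du\, (\sin u)/u$ converges, and has the value $\pi/2$.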
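The regularized integral (1.112) lends itself to a direct numerical check. The sketch below evaluates it with a midpoint rule, cutting the integration off where the factor $e^{-\epsilon u}$ is negligible, and compares with $4\pi/(k^2 + \mu^2)$ from eq. (1.114). The cutoff, the number of steps, and the sample values of $k$ and $\mu$ are illustrative assumptions.

```python
import math

def yukawa_transform(k, mu, steps=400000):
    """Midpoint-rule evaluation of (4 pi / k^2) * integral_0^inf du sin(u) exp(-eps u)."""
    eps = mu / k
    upper = 40.0 / eps                  # beyond this point exp(-eps u) < exp(-40)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.sin(u) * math.exp(-eps * u)
    return 4.0 * math.pi / k ** 2 * total * h

k = 2.0
for mu in (1.0, 0.1):
    numeric = yukawa_transform(k, mu)
    closed_form = 4.0 * math.pi / (k ** 2 + mu ** 2)
    print(mu, numeric, closed_form)
print("Coulomb limit:", 4.0 * math.pi / k ** 2)
```

As $\mu$ decreases, both the numerical and the closed-form values move toward the Coulomb result $4\pi/k^2$ of eq. (1.115).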
¹⁵A note for the advanced reader. In quantum field theory, the Coulomb
potential is understood as the interaction between two static charges
mediated by the exchange of a massless particle, the photon. The exchange of
a massive particle produces instead a Yukawa potential, with the constant
$\mu$ related to the mass $m$ of the exchanged particle by $\mu = mc/\hbar$, see
e.g., Section 6.6 of Maggiore (2005). This way of regularizing the integral
therefore corresponds, physically, to assigning a small mass $m_\gamma$ to the
photon and then taking the limit $m_\gamma \to 0$. Indeed, one way to put limits
on the photon mass is to assume that the Coulomb interaction is replaced
by a Yukawa potential, and obtain limits on $m_\gamma$. Currently, the strongest
limit on the photon mass coming from a direct test of the Coulomb law
gives $m_\gamma c^2 < 1 \times 10^{-14}$ eV. Writing $\mu = 1/r_0$, so that $r_0 = \hbar/(m_\gamma c)$, this
translates into a limit $r_0 > 2 \times 10^{7}$ m, i.e., there is no sign of an exponential
decay corresponding to a Yukawa potential up to such scales. Other limits
on the mass of the photon, not based on a direct measurement of the Coulomb
law, are even stronger, with the most stringent being $m_\gamma c^2 < 1 \times 10^{-18}$ eV,
corresponding to $r_0 > 2 \times 10^{11}$ m, see
https://pdg.lbl.gov/2021/listings/contents_listings.html. Another
way to search for deviations from Coulomb’s law is to look for a force
proportional to $1/r^{2+\epsilon}$, and set limits on $\epsilon$. This is a purely
phenomenological parametrization and, contrary to the case of the Yukawa
potential (and to statements in some textbooks), it has no field-theoretical
interpretation in terms of a photon mass.
¹⁶Here, we are keeping the axes of the reference frame fixed, and rotate the
vector by an angle $\theta$. Then $v'_x$ and $v'_y$ in eqs. (1.117) and (1.118) are the
components of the rotated vector with respect to these fixed axes. This is called
the “active” point of view. Equivalently, we can keep the vector fixed and rotate
the axes by an angle $-\theta$. Then, $v'_x$ and $v'_y$ in eqs. (1.117) and (1.118) are the
components of the vector $\mathbf{v}$ with respect to the new system of axes. This is
called the “passive” point of view.


¹⁷In elementary physics, we are usually interested only in rotations in two
or in three spatial dimensions. The generalization to arbitrary dimensions,
however, is already a useful preparation for the extension to more complicated
transformations, such as Lorentz transformations.


1.6 Tensors and rotations 


The elementary definition of a vector is based on the notion of an arrow, 
i.e., an object with a length and a direction. It is very useful to under- 
stand vectors in a more elaborate language, that of representations of 
the rotation group. This will allow us to better understand the meaning 
of other objects, such as tensors, and will be a useful preparation for 
understanding the formalism of Special Relativity, where the fundamen- 
tal quantities will be given by representations of a broader group, the 
Lorentz group, that we will introduce later. In this section, we will give 
a first, simpler treatment, while in Section 1.7 we will provide a more 
formal group-theoretical description (that will not be necessary for the 
rest of the book but gives a deeper understanding). 

Consider a vector $\mathbf{v}$ in two dimensions, defined as an arrow; with
respect to a given system of Cartesian axes, it will have components
$(v_x, v_y)$. Consider now a counterclockwise rotation of the vector by an
angle $\theta$ in the plane. After the rotation, the new components are given
by¹⁶


$$v'_x = v_x \cos\theta - v_y \sin\theta, \tag{1.117}$$

$$v'_y = v_x \sin\theta + v_y \cos\theta. \tag{1.118}$$


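Therefore, under a rotation by an angle $\theta$, the transformation of $\mathbf{v}$ is
given by

$$\begin{pmatrix} v_x \\ v_y \end{pmatrix} \to \begin{pmatrix} v'_x \\ v'_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} v_x \\ v_y \end{pmatrix}. \tag{1.119}$$

Using our notation with a sum over repeated indices, we can rewrite this
as

$$v_i \to v'_i = R_{ij} v_j, \tag{1.120}$$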


where the indices $i, j$ take the two values $\{x, y\}$ (or, equivalently, $\{1, 2\}$)
and $R_{ij} = R_{ij}(\theta)$ are the matrix elements of the $2 \times 2$ matrix that
appears in eq. (1.119).

We can now promote eq. (1.119) to the basic relation that defines 
a vector, stating that a vector in two spatial dimensions is defined as 
a set of two numbers (its components) with a well-defined transforma- 
tion property under rotations, expressed by eq. (1.119). This definition 
totally abstracts from the original notion of an arrow, and has the ad- 
vantage that it can be generalized both to objects with different trans- 
formation properties and to arbitrary dimensions. 

Consider first the generalization to arbitrary dimensions.¹⁷ We begin
by observing that the rotation (1.119) preserves the length of the vector,
so the squared norm of the vector $\mathbf{v}$ (that, with our convention of
summation over repeated indices, can be written as $v_i v_i$) is equal to the
squared norm of the vector $\mathbf{v}'$ obtained applying a rotation to $\mathbf{v}$,

$$v'_i v'_i = v_i v_i. \tag{1.121}$$


We now use this relation to define rotations and vectors, in any number of
dimensions, as follows. We consider a set of $d$ objects $v_i$, $i = 1, \ldots, d$, and
a linear transformation of the form (1.120), where now $R_{ij}$ is a $d \times d$
matrix. We define rotations in $d$ dimensions as the transformations $R_{ij}$
that preserve the norm of $\mathbf{v}$,¹⁸ so that eq. (1.121) (with $i = 1, \ldots, d$)
holds. The sets of objects $v_i$ that transform as in eq. (1.120) when $R_{ij}$
is a rotation matrix are then called “vectors under rotations” or, more
simply, vectors.

This definition allows us to characterize rotations as follows. Plugging
eq. (1.120) into eq. (1.121) we get

$$v_i v_i = R_{ij} v_j R_{ik} v_k. \tag{1.122}$$

Using eq. (1.3) and renaming the dummy indices on the right-hand side
as $i \to k$, $j \to i$, $k \to j$, this can be rewritten in the form

$$\delta_{ij} v_i v_j = R_{ij} R_{ik} v_j v_k = R_{ki} R_{kj} v_i v_j. \tag{1.123}$$

Since this must hold for any vector $v_i$, it follows that

$$R_{ki} R_{kj} = \delta_{ij}. \tag{1.124}$$


Recall that, if $R_{ij}$ are the matrix elements of the matrix $R$, the transpose
matrix $R^T$ has matrix elements $(R^T)_{ij} = R_{ji}$. Then eq. (1.124) reads
$(R^T)_{ik} R_{kj} = \delta_{ij}$ or, in matrix form,

$$R^T R = I, \tag{1.125}$$

where $I$ is the identity matrix. Therefore $R^T$ is the left inverse of $R$,
and then it also holds¹⁹

$$R R^T = I. \tag{1.126}$$

In components, this reads

$$R_{ik} R_{jk} = \delta_{ij}. \tag{1.127}$$

Therefore, rotation matrices satisfy

$$R R^T = R^T R = I, \tag{1.128}$$

or, in components,

$$R_{ki} R_{kj} = R_{ik} R_{jk} = \delta_{ij}. \tag{1.129}$$


Equation (1.128) is just the definition of orthogonal matrices. We have 
then found that rotations in d dimensions are given by d x d orthogonal 
matrices, and vectors are defined by the transformation property (1.120) 
under rotations. 
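The statements above are easy to verify on an explicit matrix. The following minimal sketch builds the two-dimensional rotation matrix of eq. (1.119) for an arbitrary angle, checks that $R^T R$ is the identity, and checks that $v_i v_i$ is unchanged by the transformation (1.120). The helper functions and the sample vector are illustrative and use plain nested lists.

```python
import math

def rotation_2d(theta):
    """Matrix of eq. (1.119) for a counterclockwise rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(m, v):
    """v'_i = R_ij v_j, eq. (1.120)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

R = rotation_2d(0.7)
v = [3.0, -4.0]
v_rotated = apply(R, v)

RtR = matmul(transpose(R), R)                 # should be the identity, eq. (1.125)
norm_before = sum(x * x for x in v)           # v_i v_i
norm_after = sum(x * x for x in v_rotated)    # v'_i v'_i
print(RtR, norm_before, norm_after)
```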

The transformation property of vectors can be generalized to objects 
with different transformation properties. In particular, in any dimension 
¹⁸We will qualify this more precisely at the end of this section and in
Section 1.7, where we will see that “proper” rotations are obtained by factoring
out a discrete parity symmetry.

¹⁹This follows from the fact that, for a square matrix, the left and right
inverse are the same. Indeed, let $A$ be a matrix, and suppose that its left
inverse, $L$, exists, so that $LA = I$. Then, multiplying by $L$ from the right, we get
$LAL = L$. Since $\det(L)\det(A) = 1$ implies $\det L \neq 0$, the matrix $L$ is itself
invertible; multiplying further by $L^{-1}$ from the left, we get $AL = I$. Then, $L$
is also the right inverse.
This section is more formal and can (actually, should!) be omitted at first
reading.


$d$ we can define a tensor $T_{ij}$ with two indices as an object that, under
rotations, transforms as

$$T_{ij} \to T'_{ij} = R_{ik} R_{jl} T_{kl}, \tag{1.130}$$

(with the indices running over two values $\{x, y\}$ in two dimensions, over
$\{x, y, z\}$ in three dimensions, and more generally over $d$ values $1, \ldots, d$
in $d$ dimensions), with the same matrix $R_{ij}$ that appears in the
transformation of vectors. Similarly, a tensor $T_{ijk}$ with three indices is defined
as an object that, under rotations, transforms as

$$T_{ijk} \to T'_{ijk} = R_{ii'} R_{jj'} R_{kk'} T_{i'j'k'}, \tag{1.131}$$

and so on. Note that, for tensors, the connection with the notion of arrow
is completely lost, and a tensor is only defined by its transformation
property (1.130).

Consider now a tensor $T_{ij}$ that, in a given reference frame, has components
$\delta_{ij}$. Normally, the numerical values of the components of a tensor
change if we perform a rotation, according to eq. (1.130). In this case,
however, in the new frame

$$\begin{aligned}
T'_{ij} &= R_{ik} R_{jl} T_{kl} \\
&= R_{ik} R_{jl} \delta_{kl} \\
&= R_{ik} R_{jk} \\
&= \delta_{ij},
\end{aligned} \tag{1.132}$$

where in the last equality we have used eq. (1.129). Thus, $\delta_{ij}$ is a very
special tensor, whose components have the same numerical values in
all frames. It is then called an invariant tensor. In three dimensions,
the only other invariant tensor of the rotation group is $\epsilon_{ijk}$. Indeed, it
transforms as

$$\epsilon_{ijk} \to R_{ii'} R_{jj'} R_{kk'} \epsilon_{i'j'k'}. \tag{1.133}$$


However, from the definition of the determinant of a $3 \times 3$ matrix, we
can check that the right-hand side of eq. (1.133) is equal to $(\det R)\,\epsilon_{ijk}$.
Equation (1.128), together with the fact that, for two matrices $A$ and
$B$, $\det(AB) = \det(A)\det(B)$, and $\det(R^T) = \det(R)$, implies that
$(\det R)^2 = 1$, so $\det R = \pm 1$. “Proper” rotations are defined as the
transformations with $\det R = +1$ (while transformations with $\det R = -1$
correspond to parity transformations, that change the orientation of one
axis, or of all three axes, possibly combined with proper rotations), so
$\epsilon_{ijk}$ is indeed invariant under (proper) rotations.
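The invariance of $\delta_{ij}$ and $\epsilon_{ijk}$ can be verified by brute force. The sketch below, in which all names and the sample angles are illustrative, applies eqs. (1.130) and (1.131) with a proper rotation (a product of rotations around the $z$ and $x$ axes) and with a reflection of the $x$ axis, for which $\det R = -1$.

```python
import math

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

idx = range(3)
delta = [[1.0 if i == j else 0.0 for j in idx] for i in idx]
eps = [[[float(levi_civita(i, j, k)) for k in idx] for j in idx] for i in idx]

def transform_rank2(R, T):
    """T'_ij = R_ik R_jl T_kl, eq. (1.130)."""
    return [[sum(R[i][k] * R[j][l] * T[k][l] for k in idx for l in idx)
             for j in idx] for i in idx]

def transform_rank3(R, T):
    """T'_ijk = R_ia R_jb R_kc T_abc, eq. (1.131)."""
    return [[[sum(R[i][a] * R[j][b] * R[k][c] * T[a][b][c]
                  for a in idx for b in idx for c in idx)
              for k in idx] for j in idx] for i in idx]

def max_diff3(A, B, sign=1.0):
    return max(abs(A[i][j][k] - sign * B[i][j][k]) for i in idx for j in idx for k in idx)

rotation = matmul(rot_z(0.4), rot_x(1.1))                       # det = +1
parity = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]    # det = -1

delta_error = max(abs(transform_rank2(rotation, delta)[i][j] - delta[i][j])
                  for i in idx for j in idx)
eps_rotation_error = max_diff3(transform_rank3(rotation, eps), eps)          # eps' = +eps
eps_parity_error = max_diff3(transform_rank3(parity, eps), eps, sign=-1.0)   # eps' = -eps
print(delta_error, eps_rotation_error, eps_parity_error)
```

All three printed deviations are at the level of floating-point rounding: the rotation leaves both tensors unchanged, while the reflection flips the sign of $\epsilon_{ijk}$, in agreement with the factor $\det R$.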


1.7 Groups and representations 


Vectors and tensors are usually the first examples that one encounters of 
a much more general concept, that of representations of groups, which is 
ubiquitous in modern theoretical physics. Even if this will not be strictly 
necessary for the rest of the book, it can be interesting to expand on 


the previous discussion, taking a more abstract and mathematical point 
of view. The underlying mathematics here is that of group theory and 
group representations. Group theory is a fundamental element of the 
mathematical arsenal of modern theoretical physics. While, especially 
with hindsight, it was already implicitly present in the classical physics 
of the 19th century, it acquired a central role first of all because of 
Special Relativity and its connection with electromagnetism (in relation, 
in particular, to the work of Lorentz and Poincaré), that we will explore 
in the following chapters. The role of group theory in physics then 
became even more central with the advent of quantum mechanics where, 
for instance, concepts such as the spin of the electron cannot be really 
understood without it. Modern particle physics, such as the Standard 
Model that unifies weak and electromagnetic interactions, as well as all 
attempts at going beyond it, are also formulated in the language of group 
theory.²⁰

A group $G$ is a set of objects $g_1, g_2, \ldots$ (discrete or continuous) among
which is defined a composition operation $g_1 \circ g_2$, such that

• If $g_1 \in G$ and $g_2 \in G$, then also $g_1 \circ g_2 \in G$.

• The composition is associative, $g_1 \circ (g_2 \circ g_3) = (g_1 \circ g_2) \circ g_3$.

• In $G$ there is the identity element $e$, defined by the fact that, for
each $g \in G$, $g \circ e = e \circ g = g$.²¹

• For each $g \in G$ there is an inverse element, that we denote $g^{-1}$,
such that $g \circ g^{-1} = g^{-1} \circ g = e$.

Note that the composition operation is not necessarily commutative.
If it is, the group is called commutative (or abelian); otherwise it is called
non-commutative (or non-abelian). Rotations form a group (in any number of
dimensions), since they satisfy the previous axioms.


Exercise 1.8 Show that rotations in two dimensions form a commutative
group, while in three dimensions they form a non-commutative
group.

[Hint: take an object such as a book, lay it on a table, denote by $x$ and
$y$ the axes on the plane of the table so that the lower edge of the book
is along the $x$ axis, and the spine of the book is along the $y$ axis. Perform
first a rotation of the book by 90° around the $x$ axis and then a rotation
of the resulting configuration by 90° around the $y$ axis. Compare this
with what happens if you first perform the rotation by 90° around the
$y$ axis and then by 90° around the $x$ axis.]
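The experiment suggested in the hint can also be mimicked numerically. The sketch below is only an illustration for specific angles and not a proof: it multiplies explicit rotation matrices in the two possible orders and measures the largest difference between the entries. The matrices for rotations around the $x$ and $y$ axes are the standard ones, and the helper names are illustrative.

```python
import math

def rot2(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

quarter_turn = math.pi / 2.0

# Two dimensions: the order of two rotations does not matter.
diff_2d = max_diff(matmul(rot2(0.3), rot2(1.2)), matmul(rot2(1.2), rot2(0.3)))

# Three dimensions: x-then-y differs from y-then-x (the rightmost matrix acts first).
x_then_y = matmul(rot_y(quarter_turn), rot_x(quarter_turn))
y_then_x = matmul(rot_x(quarter_turn), rot_y(quarter_turn))
diff_3d = max_diff(x_then_y, y_then_x)
print(diff_2d, diff_3d)
```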


A (linear) representation $R$ of a group is an operation that assigns to a
generic, abstract element $g$ of a group a linear operator $D_R(g)$ defined
on a (real or complex) vector space,

$$g \mapsto D_R(g), \tag{1.134}$$

with the property that, for all $g_1, g_2 \in G$,

$$D_R(g_1) D_R(g_2) = D_R(g_1 \circ g_2). \tag{1.135}$$




²⁰For an introduction to group theory in physics, see e.g., Zee (2016).

²¹The identity element is unique: in fact, assume that $e_1$ and $e_2$ are two
identity elements. Then, using the fact that $e_2$ is an identity element, we have
$e_1 \circ e_2 = e_1$. However, using the fact that $e_1$ is an identity element, we also
have $e_1 \circ e_2 = e_2$. Therefore $e_1 = e_2$.
²²Apart from matrix representations, the other typical situation encountered
in physics is when a group element is represented by a differential
operator acting on a space of functions. Since the space of functions
is infinite-dimensional, this representation is infinite-dimensional.


This condition means that the mapping preserves the group structure,
i.e., that it is the same to compose the elements at the group level (with
the $\circ$ operation) and then represent the resulting element $g_1 \circ g_2$, or
first represent $g_1$ and $g_2$ in terms of linear operators and then compose
the resulting linear operators. Note that the composition operation
in $D_R(g_1) D_R(g_2)$ is the one between linear operators (such as
the matrix product when the linear operators are matrices, see next),
while the composition operation in $g_1 \circ g_2$ is the one at the abstract
group level. Setting $g_1 = g$, $g_2 = e$ in eq. (1.135), we find that, for all
$g \in G$, $D_R(g) D_R(e) = D_R(g)$; similarly, setting $g_1 = e$, $g_2 = g$, we get
$D_R(e) D_R(g) = D_R(g)$. This implies that the identity element of the
group $e$ must be mapped to the identity operator $I$, i.e., $D_R(e) = I$.
Similarly, we can show that $D_R(g^{-1}) = [D_R(g)]^{-1}$.
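To separate the abstract group from its representation, it may help to look at a small discrete example that is not discussed in the text: the cyclic group of four elements, taken here as the integers 0, 1, 2, 3 with addition modulo 4 as the composition. The sketch below represents the element $g$ by the rotation matrix of angle $2\pi g/4$ and checks eq. (1.135) for all pairs, together with $D_R(e) = I$ and $D_R(g^{-1}) = [D_R(g)]^{-1}$. All names are illustrative assumptions.

```python
import math

ORDER = 4

def compose(g1, g2):
    """Abstract group composition: addition modulo ORDER."""
    return (g1 + g2) % ORDER

def inverse(g):
    return (-g) % ORDER

def D(g):
    """Two-dimensional matrix representing g: rotation by 2 pi g / ORDER."""
    angle = 2.0 * math.pi * g / ORDER
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

identity = [[1.0, 0.0], [0.0, 1.0]]

# Largest violation of D(g1) D(g2) = D(g1 o g2) over all pairs of elements.
homomorphism_error = max(max_diff(matmul(D(g1), D(g2)), D(compose(g1, g2)))
                         for g1 in range(ORDER) for g2 in range(ORDER))
identity_error = max_diff(D(0), identity)
inverse_error = max(max_diff(matmul(D(inverse(g)), D(g)), identity) for g in range(ORDER))
print(homomorphism_error, identity_error, inverse_error)
```

On the left-hand side of eq. (1.135) the composition is a matrix product, while on the right-hand side it is the modular addition of the abstract labels.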

The vector space on which the operators $D_R$ act is called the basis,
or the base space, for the representation $R$ and, as we have mentioned,
it can be a real vector space or a complex vector space. We will be
particularly interested in matrix representations. In this case, the base
space is a vector space of finite (real or complex) dimension $n$, and an
abstract group element $g$ is represented by an $n \times n$ matrix $[D_R(g)]_{ij}$,
with $i, j = 1, \ldots, n$. The dimension of the representation is defined as
the dimension $n$ of the base space.

Writing a generic element of the base space as a vector $\mathbf{v}$ with
components $(v_1, \ldots, v_n)$, a group element $g$ can be associated with a linear
operator $[D_R(g)]_{ij}$ acting on the base space, and therefore with a linear
transformation of the base space

$$v_i \to [D_R(g)]_{ij}\, v_j, \tag{1.136}$$

with our usual summation convention on repeated indices. The important
point is that eq. (1.136) allows us to attach a physical meaning
to a group element: before introducing the concept of representation,
a group element $g$ is just an abstract mathematical object defined by
its composition rules with the other group members. Choosing a specific
representation, instead, allows us to interpret $g$ as a transformation
acting on a vector space.²²


1.7.1 Reducible and irreducible representations 


The different representations of a group describe all possible ways in
which objects can transform under the action of the group, for instance
under rotations. However, not all possible representations describe
genuinely different possibilities. Given a representation $R$ of a group $G$,
whose basis is a space $X$, and another representation $R'$ of $G$, whose
basis is a space $X'$, $R$ and $R'$ are called equivalent if there is an
isomorphism $S : X \to X'$ such that, for all $g \in G$,

$$D_R(g) = S^{-1} D_{R'}(g)\, S. \tag{1.137}$$

Comparing with eq. (1.136), we see that, in the case of representations
of finite dimension, equivalent representations correspond to a change
of basis in the vector space spanned by the $v_i$, obtained acting on the
basis vectors with a matrix $S$, so we do not consider them as genuinely
different representations.

Furthermore, some representations can be obtained trivially, just by
stacking together representations of lower dimensions, and do not really
describe novel ways of transforming under the action of the group. As a
simple example, consider two vectors $\mathbf{v}$ and $\mathbf{w}$, in two dimensions. Under
a rotation in the plane by an angle $\theta$, $\mathbf{v}$ transforms as in eq. (1.119), and
similarly

$$\begin{pmatrix} w_x \\ w_y \end{pmatrix} \to \begin{pmatrix} w'_x \\ w'_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} w_x \\ w_y \end{pmatrix}, \tag{1.138}$$

which expresses the fact that, in two dimensions, vectors are representations
of dimension two of the rotation group. Naively, we might think
that we can find a new type of representation of dimension four, by
putting together the components of $\mathbf{v}$ and $\mathbf{w}$ into a single object with
components $(v_x, v_y, w_x, w_y)$. Under rotations,

$$\begin{pmatrix} v_x \\ v_y \\ w_x \\ w_y \end{pmatrix} \to \begin{pmatrix} v'_x \\ v'_y \\ w'_x \\ w'_y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & \cos\theta & -\sin\theta \\ 0 & 0 & \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} v_x \\ v_y \\ w_x \\ w_y \end{pmatrix}. \tag{1.139}$$

However, here we have not discovered a genuinely new type of representation
of dimension four; we have simply stacked together two vectors.
Mathematically, the fact that this is not a genuinely new representation
is revealed by the fact that the matrix in eq. (1.139) is block diagonal
(for all rotations, so in this case for all values of $\theta$), so that there is no
rotation that mixes the components of $\mathbf{v}$ with those of $\mathbf{w}$.
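The absence of mixing can be made explicit with a short sketch. It builds the block-diagonal matrix of eq. (1.139) for several angles and applies it to an object whose $\mathbf{w}$ components vanish; after every rotation these components are still zero, so the subspace spanned by the $\mathbf{v}$ components is mapped into itself. The function names and the sample values are illustrative.

```python
import math

def stacked_rotation(theta):
    """Block-diagonal 4 x 4 matrix of eq. (1.139)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, c, -s],
            [0.0, 0.0, s, c]]

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

state = [1.0, 2.0, 0.0, 0.0]            # (v_x, v_y, w_x, w_y) with w = 0
angles = [0.1 * n for n in range(1, 60)]

# Largest w component generated by any of the sampled rotations.
leak = max(max(abs(apply(stacked_rotation(t), state)[2]),
               abs(apply(stacked_rotation(t), state)[3])) for t in angles)

# The v block still rotates as an ordinary two-dimensional vector.
rotated = apply(stacked_rotation(math.pi / 2.0), state)
print(leak, rotated)
```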

This example motivates the definition of reducible and irreducible
representations. A representation $R$ is called reducible if it has an invariant
subspace, i.e., if the action of any $D_R(g)$ on the vectors in the subspace
gives another vector of the subspace. This corresponds to the fact that
[possibly after a suitable change of basis of the form (1.137)] there is
a subset of components of the basis vector that never mixes with the
others, for all transformations $D_R(g)$. A representation $R$ is called
completely reducible if, for all elements $g$, the matrices $D_R(g)$ have a block
diagonal form, or can be put in a block diagonal form with a change
of basis corresponding to the equivalence relation (1.137). Conversely,
a representation with no invariant subspace is called irreducible.
Irreducible representations describe the genuinely different ways in which
physical quantities can transform under the group of transformations in
question.

To put some flesh into these rather abstract notions, we now look at 
the simplest irreducible representations of the rotation group. 
1.7.2 The rotation group and its irreducible tensor representations


The group of rotations in $d$ spatial dimensions can be defined as the
group of linear transformations of a $d$-dimensional space with coordinates
$(x_1, \ldots, x_d)$, of the form

$$x_i \to x'_i = R_{ij} x_j, \tag{1.140}$$

which leaves invariant the quadratic form

$$|\mathbf{x}|^2 = x_1^2 + \ldots + x_d^2. \tag{1.141}$$

As we already found in eq. (1.128), the corresponding condition on the
matrix $R_{ij}$ is that it must be an orthogonal matrix, $R R^T = R^T R = I$.
The group of $d \times d$ orthogonal matrices is denoted by $O(d)$. As we already
found below eq. (1.133), from $R^T R = I$ it follows that $(\det R)^2 = 1$, so
$\det R = \pm 1$. The transformations with determinant $+1$ form a subgroup,
since the product of two matrices with determinant $+1$ is
still a matrix with determinant $+1$ (which is not the case for those with
determinant $-1$, given that the product of two matrices with determinant
$-1$ is a matrix with determinant $+1$). The subgroup of $O(d)$ with
determinant equal to $+1$ is denoted by $SO(d)$ and is identified with
the proper rotation group in $d$ dimensions. As already mentioned below
eq. (1.133), transformations with $\det R = -1$ rather correspond to
discrete parity transformations, such as $(x \to -x, y \to y, z \to z)$ or
$(x \to -x, y \to -y, z \to -z)$ or, more generally, to parity transformations
combined with proper rotations. By “rotations” we will always
mean the proper rotations. In particular, the rotation group in three
dimensions is $SO(3)$.

The simplest irreducible representation of the rotation group, as of
any other group, is the so-called trivial representation, which assigns to
each group element the 1 × 1 identity matrix, i.e., $D_R(g) = 1$ for each g.
In this case, the basic properties (1.135) of a representation are obviously
satisfied (note, however, that if we assign $D_R(g) = I$ with I the n × n
identity matrix with dimension n > 1, we do not get an irreducible
representation, since the identity matrix is block diagonal). Despite
being trivial from the mathematical point of view, this representation is
very significant physically. According to eq. (1.136), the basis for this
representation is an object of dimension 1, i.e., a single quantity $\phi$ that,
under any rotation, remains invariant, $\phi \to \phi$. Such quantities are called
scalars under rotation.

The next simplest representation of SO(3) is the vector representation
$v_i$. Indeed, the very definition that we have given of the group
SO(3), based on eq. (1.140) and the condition that it leaves invariant
the quadratic form (1.141), already introduces a transformation property
under rotation, in this case for the coordinates. More generally,
vectors are then defined as objects that transform as in eq. (1.120), and
the coordinates of three-dimensional space provide the simplest example
of a vector, $\mathbf{x} = (x_1, x_2, x_3)$ [or $\mathbf{x} = (x, y, z)$; we will use the two
notations interchangeably]. Other obvious examples of vectors from
elementary mechanics are the momentum $\mathbf{p}$, or the angular momentum
$\mathbf{L}$.

The vector representation can be taken to be real, since the rotation
matrix $R_{ij}$ has only real elements, and therefore transforms real
vectors into real vectors. The real vector representation of SO(3) is
irreducible, since we cannot find a subset of the coordinates $(x_1, x_2, x_3)$
that, whichever rotation we perform, never mixes with the others. For
instance, a rotation around the third axis mixes $x_1$ with $x_2$, a rotation
around the $x_1$ axis mixes $x_2$ with $x_3$, and one around the $x_2$ axis mixes
$x_1$ with $x_3$. Since the vector representation enters in the very definition
of the group SO(3), it is also called the defining, or fundamental,
representation of SO(3).²³

Consider next the tensor representation (1.130). In three dimensions
a tensor $T_{ij}$ has nine components, and eq. (1.130) states that these nine
components transform linearly among themselves, and therefore form a basis
for a representation of the rotation group of dimension nine. We could for
instance express the corresponding representation of rotations as 9 × 9
matrices, using as a basis $(T_{11}, T_{12}, \ldots, T_{33})$. We want to understand
whether this representation is reducible, i.e., if (possibly after a suitable
change of basis) there are subsets of elements of the basis that never
mix with the other elements. The best way to address the problem is to
observe that, given a generic tensor $T_{ij}$, we can always separate it into
its symmetric and antisymmetric parts

$$T_{ij} = \frac{T_{ij} + T_{ji}}{2} + \frac{T_{ij} - T_{ji}}{2} \equiv S_{ij} + A_{ij} . \tag{1.142}$$


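Using eq. (1.130), we can show that, under any rotation, a symmetric
tensor remains symmetric:

$$\begin{aligned}
S_{ij} \to S'_{ij} &= R_{ik} R_{jl} S_{kl} \\
&= R_{ik} R_{jl} S_{lk} \\
&= R_{jk} R_{il} S_{kl} \\
&= S'_{ji} ,
\end{aligned} \tag{1.143}$$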


where, to get the second equality, we used the symmetry of $S_{kl}$ and, in
the third, we renamed the dummy indices $k \to l$ and $l \to k$. Similarly, an
antisymmetric tensor remains antisymmetric. Therefore, $S_{ij}$ and $A_{ij}$ do
not mix under rotations. Consider now the trace of the symmetric part,
$S = \delta_{ij} S_{ij}$. Using eq. (1.130) and the property (1.129) of orthogonal
matrices, we see that it is invariant under spatial rotations. It is then
convenient to separate $S_{ij}$ as

$$S_{ij} = \left( S_{ij} - \frac{1}{3}\delta_{ij} S \right) + \frac{1}{3}\delta_{ij} S
\equiv S^T_{ij} + \frac{1}{3}\delta_{ij} S . \tag{1.144}$$




23 Observe that we are considering here real representations, i.e.,
representations where the basis vectors are real. One can sometimes
assemble real representations of SO(d) into complex representations of
lower (complex) dimension. For instance, for SO(2), eq. (1.119) defines an
irreducible real representation of (real) dimension two, since the real
quantities $v_x$ and $v_y$ mix among themselves. However, if we form the
complex combinations $v_\pm = v_x \pm i v_y$, we see that $v_+ \to e^{i\theta} v_+$
and $v_- \to e^{-i\theta} v_-$ so, over complex numbers, we have two irreducible
representations of complex dimension one. Indeed, a theorem states that
all irreducible complex representations of abelian groups, such as SO(2),
are one-dimensional.

For other groups, such as the groups U(N) of N × N unitary matrices, or
SU(N) if we require the determinant to be +1, the matrix elements are
complex, so, in general, one must consider complex representations.

24 We are working for simplicity in three dimensions. In d dimensions, the
traceless combination is $S_{ij} - (1/d)\delta_{ij} S$.


The term $S^T_{ij}$ is traceless (as can be shown by contracting it with
$\delta_{ij}$ and using $\delta_{ij}\delta_{ij} = \delta_{ii} = 3$).²⁴ Since the trace is
invariant, under rotations it remains traceless, and therefore it does not
mix with the term $\delta_{ij} S$. We have therefore separated the nine components
of a generic tensor $T_{ij}$ into its symmetric traceless part $S^T_{ij}$ (which
has five components, corresponding to the six independent components of a
symmetric 3 × 3 matrix, minus the condition of zero trace), its trace S
(one component), and an antisymmetric tensor $A_{ij}$ which, in three
dimensions, has three independent components,

$$T_{ij} = S^T_{ij} + \frac{1}{3}\delta_{ij} S + A_{ij} . \tag{1.145}$$


The trace $S = \delta_{ij} T_{ij}$, being invariant under rotations, corresponds to the
scalar representation that we already encountered. The antisymmetric
tensor might look like a new representation of the rotation group, but
in fact, introducing

$$A_i = \frac{1}{2}\epsilon_{ijk} A_{jk} , \tag{1.146}$$

one sees that $A_i$ is just a spatial vector. We have inserted a factor of
1/2 for convenience since, for instance, in this way
$A_1 = (1/2)\epsilon_{1jk} A_{jk} = (1/2)(\epsilon_{123} A_{23} + \epsilon_{132} A_{32}) = A_{23}$
[using the antisymmetry of $\epsilon_{ijk}$ and of $A_{jk}$ with respect to (j, k)].
So, $A_{23} = A_1$, and similarly one finds $A_{12} = A_3$, $A_{31} = A_2$, so the
inversion of eq. (1.146) can be written compactly as

$$A_{ij} = \epsilon_{ijk} A_k . \tag{1.147}$$


Therefore, the three independent components of an antisymmetric tensor 
can be rearranged into the three independent components of a vector. 
This means that these two representations are equivalent, in the sense 
of eq. (1.137), and we have not discovered a genuinely new irreducible 
representation of rotations. 

In contrast, the traceless symmetric tensor is irreducible (since there
is no further symmetry that forbids its components to mix among themselves,
and then we can always find rotations that mix a given component of
$S^T_{ij}$ with any other component). We have therefore found a new irreducible
representation of the rotation group, of dimension five, the traceless
symmetric tensors, and eq. (1.145) can be rewritten as

$$T_{ij} = S^T_{ij} + \frac{1}{3}\delta_{ij} S + \epsilon_{ijk} A_k , \tag{1.148}$$

in terms of the invariant tensors $\delta_{ij}$ and $\epsilon_{ijk}$, and of the irreducible
representations provided by the scalar S, the vector $A_i$, and the traceless
symmetric tensor $S^T_{ij}$. Note how the nine independent components of
$T_{ij}$ have been shared between a scalar (one component), a vector (three
components), and a traceless symmetric tensor (five components). One
can work out similarly the decomposition into irreducible representations
of tensors with more indices, such as $T_{ijk}$.
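The decomposition of eq. (1.148) can be checked numerically. The following is a minimal sketch in plain Python, under the assumption that a tensor is stored as a nested list of floats; the helper names (`eps`, `decompose`, `recompose`, `rotate_tensor`, and so on) are introduced here for illustration and are not part of the text. It splits a generic tensor into its trace, the vector dual to its antisymmetric part, and its traceless symmetric part, and then compares the pieces before and after a proper rotation.

```python
import math

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def decompose(T):
    """Return (S, A_vec, ST): the trace, the vector A_i = (1/2) eps_ijk A_jk
    built from the antisymmetric part, and the traceless symmetric part."""
    n = range(3)
    S = sum(T[i][i] for i in n)
    A = [[(T[i][j] - T[j][i]) / 2 for j in n] for i in n]
    A_vec = [sum(eps(i, j, k) * A[j][k] for j in n for k in n) / 2 for i in n]
    ST = [[(T[i][j] + T[j][i]) / 2 - (S / 3 if i == j else 0.0) for j in n]
          for i in n]
    return S, A_vec, ST

def recompose(S, A_vec, ST):
    """Rebuild T_ij = ST_ij + (1/3) delta_ij S + eps_ijk A_k."""
    n = range(3)
    return [[ST[i][j] + (S / 3 if i == j else 0.0)
             + sum(eps(i, j, k) * A_vec[k] for k in n) for j in n] for i in n]

def matmul(A, B):
    n = range(3)
    return [[sum(A[i][k] * B[k][j] for k in n) for j in n] for i in n]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rotate_tensor(R, T):
    """T'_ij = R_ik R_jl T_kl, i.e. T' = R T R^T."""
    return matmul(matmul(R, T), transpose(R))

def rotate_vector(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

# An arbitrary tensor and a generic proper rotation (illustrative values).
T = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
R = matmul(rot_z(0.7), rot_x(-1.2))

S, A_vec, ST = decompose(T)
S_rot, A_rot, ST_rot = decompose(rotate_tensor(R, T))
```

With these definitions, `S_rot` agrees with `S`, `A_rot` agrees with `rotate_vector(R, A_vec)`, and `ST_rot` agrees with `rotate_tensor(R, ST)`, up to rounding: the three pieces transform separately, as a scalar, a vector, and a traceless symmetric tensor.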

Observe that the dimension of the scalar, vector, and traceless sym- 
metric tensor representations can all be written as 2s + 1 for s = 0 


(scalar), s = 1 (vector) and s = 2 (traceless symmetric tensor). Higher- 
order tensorial irreducible representations, obtained from tensors with 
more indices, give representations of dimension 2s + 1 for all other in- 
teger values of s. The representations of the rotation group play an 
important role also in quantum mechanics, where the (tensorial) repre- 
sentations that we have found turn out to correspond to massive particles 
with integer spin s (in units of $\hbar$).

Finally, it is instructive to consider infinitesimal rotations, and see 
how they can be parametrized. A rotation matrix that differs from the 
identity transformation by an infinitesimal quantity can be written as 


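$$R_{ij} = \delta_{ij} + \omega_{ij} + O(\omega^2) , \tag{1.149}$$

with $\omega_{ij}$ infinitesimal parameters, which describe the deviation from the
identity transformation. Requiring that this satisfies eq. (1.129), to first
order in $\omega$ we get

$$\begin{aligned}
\delta_{ij} &= \left[ \delta_{ik} + \omega_{ik} + O(\omega^2) \right] \left[ \delta_{jk} + \omega_{jk} + O(\omega^2) \right] \\
&= \delta_{ij} + \omega_{ij} + \omega_{ji} + O(\omega^2) ,
\end{aligned} \tag{1.150}$$

and therefore, requiring that the linear order cancels, we get²⁵

$$\omega_{ij} = -\omega_{ji} , \tag{1.151}$$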


i.e., $\omega_{ij}$ is an antisymmetric tensor. An antisymmetric tensor $\omega_{ij}$ can
be written in terms of a vector $\delta\boldsymbol{\theta}$ as in eq. (1.147), i.e., we can write
$\omega_{ij} = -\epsilon_{ijk}\,\delta\theta^k$ (where the minus sign is a convention, chosen so
that $\delta\theta^i$ corresponds to counterclockwise rotations, and the use of the
notation $\delta\theta^i$ stresses that this is an infinitesimal angle), so

$$R_{ij} = \delta_{ij} - \epsilon_{ijk}\,\delta\theta^k . \tag{1.152}$$


Then, the infinitesimal form of eq. (1.140) is 


$$\begin{aligned}
x^i \to x^i + \omega^{ij} x^j &= x^i - \epsilon^{ijk} x^j \delta\theta^k .
\end{aligned} \tag{1.153}$$

As a check, we can consider a rotation around the z axis, so that
$\delta\theta^1 = \delta\theta^2 = 0$ and $\delta\theta^3 = \delta\theta$. Then, eq. (1.153) gives
$x \to x - (\delta\theta) y$, $y \to y + (\delta\theta) x$, $z \to z$, which is the infinitesimal
form of the counterclockwise rotation around the z axis,
$x \to x\cos\theta - y\sin\theta$, $y \to x\sin\theta + y\cos\theta$.

25 Requiring the cancelation also to quadratic and higher order does not
give any further constraint. This is a general property of Lie groups (i.e.,
groups parametrized in a continuous and differentiable manner by a set of
parameters); see, e.g., Section 2.1 of Maggiore (2005).
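The infinitesimal rotation of eqs. (1.152) and (1.153) lends itself to a quick numerical check. The sketch below is illustrative only: the function names and the value `dtheta = 1e-4` are choices made here, not taken from the text. It builds $R_{ij} = \delta_{ij} - \epsilon_{ijk}\delta\theta^k$ for a rotation around the z axis, applies it to a point, and measures how far $R R^T$ is from the identity.

```python
import math

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def infinitesimal_rotation(dtheta):
    """R_ij = delta_ij - eps_ijk dtheta^k, as in eq. (1.152)."""
    n = range(3)
    return [[(1.0 if i == j else 0.0)
             - sum(eps(i, j, k) * dtheta[k] for k in n) for j in n] for i in n]

def rotation_z(theta):
    """Finite counterclockwise rotation around the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def orthogonality_defect(R):
    """Largest entry of |R R^T - 1|; expected to be of order dtheta^2."""
    n = range(3)
    return max(abs(sum(R[i][k] * R[j][k] for k in n) - (1.0 if i == j else 0.0))
               for i in n for j in n)

def max_difference(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

dtheta = 1e-4                                        # illustrative small angle
R_inf = infinitesimal_rotation([0.0, 0.0, dtheta])   # rotation around z

point = (1.0, 2.0, 3.0)
x_new = sum(R_inf[0][j] * v for j, v in enumerate(point))
y_new = sum(R_inf[1][j] * v for j, v in enumerate(point))

defect = orthogonality_defect(R_inf)
difference = max_difference(R_inf, rotation_z(dtheta))
```

Here `x_new` and `y_new` reproduce $x - (\delta\theta) y$ and $y + (\delta\theta) x$, while both `defect` and `difference` are of order $\delta\theta^2$, consistent with a matrix that is orthogonal only up to first order in $\omega$.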


Systems of units 


As discussed in the Preface, there are two common systems of units, SI 
and Gaussian, each one with its own advantages. We begin by defining 
them and discussing their relation, which involves some subtleties. The 
SI system is the most widely used in most contexts but an acquaintance 
with the Gaussian system can also be very useful, in particular to prepare 
the transition from classical electrodynamics to quantum field theory, 
where (for reasons that we will explain) nowadays only Gaussian units 
are used. Therefore, while we will use SI units in this book, we will 
explain how to quickly translate the results into Gaussian units and, in 
App. A, we will collect some of the most important formulas, written in 
Gaussian units. 


2.1 The SI system 


Since its establishment in 1960, the International System of Units, or 
SI (from the French Systéme International), has imposed itself as the 
international standard. The SI is based on seven basic units from which 
all other units can be derived. For our course, we will really only need 
four of them: the meter (the unit of length, m), the kilogram (the unit 
of mass, kg), the second (the unit of time, s), which together form the
m.k.s. system used in classical mechanics, and the ampere (the unit of
electric current, A).¹,²

Over the years, the general trend in defining the basic SI units has 
been to get rid of definitions involving any man-made standard, or prop- 
erties of macroscopic materials, or descriptions of measurements, and 
rather define them using fundamental constants of Nature, or quantum 
properties of matter at the atomic level. For instance, the meter was 
originally defined in 1793 as one ten-millionth of the distance from the 
equator to the North Pole along a great circle and, after other redefi- 
nitions, was eventually redefined in 1899 as the distance between two 
lines marked on a prototype meter bar made of an alloy of platinum 
and iridium and conserved in the International Bureau of Weights and 
Measures in Sèvres, France. Such a definition had obvious intrinsic
problems of reproducibility (national meter prototypes had to be fabricated
and distributed) and of stability with time, and made reference to specific
experimental conditions (e.g., the distance between the lines had to be
measured at the melting point of ice). Similarly, the ancient definition
of the second was based on the Earth’s rotation period, so that 1 day =
24 hr = 24 × 3600 s. The modern definitions of the units of time and of

1 Two more basic units are the kelvin (unit of temperature, K) and the
mole (amount of substance, mol). The SI includes a seventh base unit,
the candela (cd), related to luminous intensity as perceived by the
human eye, mostly of interest for biology and physiology. See
https://www.bipm.org/utils/common/pdf/si-brochure/SI-Brochure-9-EN.pdf
for a detailed description.


2 Note that, while the correct spelling of the last name of André-Marie
Ampère (1775-1836) involves an accent, in English the unit of measure
“ampere” is written without the accent. So, we will refer to Ampère’s law,
but the current unit is the ampere. Note also that units derived from names
of persons are written in lower case, as in ampere, coulomb, or kelvin
(except when grammatical rules require an upper case letter), while their
symbols are written in upper case, A for ampere, C for coulomb, K for kelvin.

3 The exact definition is that the unperturbed ground state hyperfine
transition frequency of an atom of Cs 133 is $\Delta\nu = 9\,192\,631\,770$ Hz,
where 1 Hz = 1 s$^{-1}$.

4 Similarly, the kelvin (K) and the mole (mol) are defined by fixing,
respectively, the Boltzmann constant

$$k_B = 1.380649 \times 10^{-23}\ {\rm J\,K^{-1}} , \tag{2.2}$$

and the Avogadro number

$$N_A = 6.02214076 \times 10^{23}\ {\rm mol^{-1}} . \tag{2.3}$$


length are different. The second is defined in terms of the frequency of a
specific atomic transition.³ Given this definition of a second, the meter
is now defined in terms of the speed of light c, stating that, by definition,

$$c = 299\,792\,458\ {\rm m/s} . \tag{2.1}$$

This definition is consistent with the previous definition of the meter
based on a reference bar within the experimental error, but gets rid of
any reference to specific macroscopic objects or detailed measurement
processes. In the same spirit, the kilogram is now no longer defined
in terms of a reference object, but rather from the Planck constant h,
stating that, by definition, $h = 6.62607015 \times 10^{-34}$ J s, where the unit
of energy, the joule, is related to the basic units by J = kg m$^2$ s$^{-2}$. Note
that, in this way, h and c are used to define the units and therefore their
values are fixed by definition, and have no experimental error associated
with them.⁴

For electromagnetic phenomena, the SI system proceeds by defining a 
base unit for electric current, the ampere (A). Since a current is a charge 
flowing per unit time, this induces the definition of a derived SI unit of 
charge, the coulomb (C), from 1 A = 1 C/s. In 2019, the SI definition 
of the ampere (that, as we discuss in more detail next, was previously 
based on the force between two parallel wires carrying a current) has 
been changed, in order to relate it to fundamental constants rather than 
to experimental settings: now the ampere is defined from 1 A = 1 C/s, 
and the coulomb is defined in terms of the electron charge —e, stating 
that, by definition, 


$$e = 1.602176634 \times 10^{-19}\ {\rm C} . \tag{2.4}$$


To understand the reason underlying this definition and the relation 
with the earlier definition, let us look at the force exerted between static 
charges, and at the force between parallel wires carrying currents. It is 
an experimental fact that two static charges, at a distance r, attract or 
repel each other with a force inversely proportional to the square of the 
distance. Even before having defined the units of electric charge, we also 
know that the force is proportional to the charge $q_1$ of the first body
and to the charge $q_2$ of the second body, as could be shown by comparing
systems with a different number of individual electron charges. This
gives Coulomb’s law,

$$\mathbf{F} = k\,\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} . \tag{2.5}$$


The value of the proportionality constant k depends, of course, on the
units chosen for the electric charge or, vice versa, could be used to define
the unit of electric charge. In the SI system, the constant k is denoted
as $1/(4\pi\epsilon_0)$, so

$$\mathbf{F} = \frac{1}{4\pi\epsilon_0}\,\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} . \tag{2.6}$$

The constant $\epsilon_0$ is called the vacuum electric permittivity, or just the
vacuum permittivity.


Similarly, if we take two parallel wires (approximated as infinitely long
and of negligible thickness) carrying currents $I_1$ and $I_2$ and separated
by a distance d, observation shows that they attract or repel each
other with a force per unit length, $dF/d\ell$, proportional to $I_1 I_2/d$. The
proportionality constant in the SI system is called $\mu_0/(2\pi)$, so, for the
modulus, we have

$$\frac{dF}{d\ell} = \frac{\mu_0}{2\pi}\,\frac{I_1 I_2}{d} . \tag{2.7}$$

The force is attractive if the currents are parallel, and repulsive if they
are antiparallel (we will see how eqs. (2.6) and (2.7) follow from Maxwell’s
equations in Sections 4.1.1 and 4.2.3, respectively). The constant $\mu_0$ is
called the vacuum magnetic permeability. As we will see, Maxwell’s
equations tell us that $\mu_0$ and $\epsilon_0$ are not independent, but are related by
the exact relation

$$\epsilon_0 \mu_0 = \frac{1}{c^2} , \tag{2.8}$$

where c is the speed of light. In eq. (2.6) the value of $\epsilon_0$ depends on
the choice of units of the electric charge and in eq. (2.7) the value of $\mu_0$
depends on the choice of units of the electric current. However, since
$\epsilon_0$ and $\mu_0$ are related by eq. (2.8), fixing one of the two units fixes the
other.

The definition of the ampere before 2019 was obtained from eq. (2.7),
by stating that, if we take two long (and infinitesimally thin) parallel
wires each carrying a current of 1 A and at a distance of 1 m, the force
per unit length in eq. (2.7) is equal to $2 \times 10^{-7}$ N/m, where the newton
(N) is the derived SI unit of force. The reasons for this numerical value
are historical, but the orders of magnitude are such that, for two wires
carrying a current of 1 A (and at a distance d of order, say, of a few
centimeters, rather than one meter), this force could be measured in a
laboratory with rather simple techniques. This definition of the ampere
(and therefore of the coulomb) amounts to fixing by definition $\mu_0$ in
eq. (2.7) to the value

$$\mu_0 = 4\pi \times 10^{-7}\ \frac{\rm N}{\rm A^2} , \tag{2.9}$$


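so $\mu_0$ has no observational error. Equation (2.8), together with eq. (2.1),
then fixes $\epsilon_0$,

\begin{align}
\frac{1}{4\pi\epsilon_0} &= (2.99792458)^2 \times 10^{9}\ \frac{\rm N\,m^2}{\rm C^2} \tag{2.10} \\
&= (8.988\ldots) \times 10^{9}\ \frac{\rm N\,m^2}{\rm C^2} , \tag{2.11}
\end{align}

or⁵

$$\epsilon_0 \simeq (8.854\ldots) \times 10^{-12}\ \frac{\rm C^2}{\rm N\,m^2} , \tag{2.13}$$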


again with no observational error, since both $\mu_0$ and c are exact numbers.
In contrast, the electron charge becomes a measurable quantity. For
instance, we could in principle use eq. (2.6) to measure the electric force
between two electrons at a given distance and, since $\epsilon_0$ has been fixed,

5 In terms of the fundamental SI units,

$$\frac{\rm C^2}{\rm N\,m^2} = \frac{\rm A^2\,s^4}{\rm kg\,m^3} . \tag{2.12}$$

6 https://pdg.lbl.gov/2018/reviews/rpp2018-rev-phys-constants.pdf.

7 As we have seen, the fourth fundamental base unit of the SI has been
chosen as the ampere rather than the coulomb. It is sometimes natural to
use the coulomb as the fourth base unit, as we occasionally do here and in
other places in the book.


we would get a measurement of the electron charge. In practice, the
most accurate measurements of the electron charge are obtained with
other methods, such as the quantum Hall effect. In any case, the point
is that, with this pre-2019 definition of the ampere, the electron charge
was a measurable quantity with an observational error. For instance, the
2018 edition of the Review of Particle Physics, which contains a standard
compilation of physical quantities and of elementary particle properties,
quoted the value $e = 1.6021766208(98) \times 10^{-19}$ C.⁶
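The numbers entering the pre-2019 chain of definitions, eqs. (2.7) to (2.13), can be reproduced in a few lines. This is a minimal sketch; the variable names are chosen here for illustration, and floating-point arithmetic is used even though the pre-2019 values were exact by definition.

```python
import math

c = 299_792_458.0                      # m/s, exact by definition, eq. (2.1)
mu0_pre2019 = 4 * math.pi * 1e-7       # N/A^2, fixed by definition, eq. (2.9)

# Force per unit length between two wires carrying 1 A at 1 m, eq. (2.7).
I1 = I2 = 1.0                          # A
d = 1.0                                # m
force_per_length = mu0_pre2019 * I1 * I2 / (2 * math.pi * d)   # N/m

# eps0 then follows from eps0 * mu0 = 1/c^2, eq. (2.8).
eps0_pre2019 = 1.0 / (mu0_pre2019 * c**2)        # C^2 / (N m^2), eq. (2.13)
k_pre2019 = 1.0 / (4 * math.pi * eps0_pre2019)   # N m^2 / C^2, eq. (2.10)

print(f"dF/dl = {force_per_length:.3e} N/m")
print(f"eps0  = {eps0_pre2019:.6e} C^2/(N m^2)")
print(f"k     = {k_pre2019:.6e} N m^2/C^2")
```

The value of `k_pre2019` coincides with $10^{-7} c^2$ in SI units, which is the origin of the factor $(2.99792458)^2 \times 10^9$ in eq. (2.10).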

In 2019, the official SI definition of the ampere was changed and now, 
in the spirit of using only fundamental constants to define the units, 
it is based on eq. (2.4), which defines the coulomb in terms of e (and 
then the ampere from 1 A = 1 C/s). Therefore, now the value of the 
electron charge is fixed by definition and no longer has an observational 
error associated with it (just as the speed of light and the Planck con- 
stant). Having fixed the definition of the ampere in this way, we can 
now measure the force per unit length between two parallel wires at a 
given distance and carrying a current of, say, 1 A each. Then, from 
eq. (2.7) we now get a measurement of $\mu_0$, so $\mu_0$ becomes a measurable
quantity with an observational error; its measured value, as of 2019, was
consistent with the older definition $\mu_0 = 4\pi \times 10^{-7}$ N/A$^2$ to a relative
standard uncertainty of $2.3 \times 10^{-10}$. Since eq. (2.8) remains exact, being
a mathematical consequence of Maxwell’s equations, once $\mu_0$ has been
measured one also gets $\epsilon_0$, and the relative accuracy on $\epsilon_0$ is the same as
that on $\mu_0$, since c has no error.

From eq. (2.4) we see that 1 C corresponds to a huge number of elementary
charges, of order $10^{19}$. This clearly shows how the SI units have
their roots in the laboratory, since a huge number of elementary charges 
flowing per second is needed to produce a typical current observed in 
simple laboratory situations. 
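As a short illustration of this order of magnitude, the number of elementary charges in one coulomb follows directly from eq. (2.4). The function name below is a hypothetical helper introduced only for this sketch.

```python
e = 1.602176634e-19        # C, exact by definition, eq. (2.4)

# Number of elementary charges that add up to 1 C.
charges_per_coulomb = 1.0 / e

def carriers_per_second(current_in_ampere):
    """Elementary charges crossing a section of wire per second,
    using 1 A = 1 C/s."""
    return current_in_ampere / e

print(f"{charges_per_coulomb:.4e} elementary charges per coulomb")
print(f"{carriers_per_second(1e-3):.4e} per second for a current of 1 mA")
```

Even a current of one milliampere therefore corresponds to several times $10^{15}$ elementary charges per second.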

The definition of electromagnetic units is completed by the definition 
of the electric and magnetic fields. These are defined through the Lorentz 
force equation that, in SI units, reads 


$$\mathbf{F} = q\,(\mathbf{E} + \mathbf{v} \times \mathbf{B}) . \tag{2.14}$$


Maxwell’s equations will be discussed in detail in Chapter 3, but for 
completeness we write them also here, in SI units: 


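\begin{align}
\nabla\cdot\mathbf{E} &= \frac{1}{\epsilon_0}\,\rho , \tag{2.15} \\
\nabla\times\mathbf{B} &= \mu_0\,\mathbf{j} + \mu_0\epsilon_0\,\frac{\partial\mathbf{E}}{\partial t} , \tag{2.16} \\
\nabla\cdot\mathbf{B} &= 0 , \tag{2.17} \\
\nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t} . \tag{2.18}
\end{align}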

As is clear from eq. (2.14), the SI dimensions of the electric field are
N/C, or kg m/(s$^2$ C).⁷ We also introduce, as a derived unit, the volt
(V), defined as V = J/C (where J is the joule). The volt is therefore
an energy per unit charge, so it is the unit used for potentials. From
J = kg m$^2$/s$^2$ it follows that, in SI units, the electric field has dimensions
of V/m,

$$[E] = \frac{\rm kg\,m}{\rm s^2\,C} = \frac{\rm V}{\rm m} . \tag{2.19}$$

Again from eq. (2.14), we see that in the SI the magnetic field has units
of N/(C m s$^{-1}$) = kg/(C s). This quantity is called the tesla (T), and is
the derived SI unit for magnetic field:

$$[B] = \frac{\rm kg}{\rm C\,s} = {\rm T} . \tag{2.20}$$
From eq. (2.15), together with eqs. (2.10) and (2.19), we see that $\rho$ has
dimensions C/m$^3$, so it is an electric charge per unit volume. The electric
charge Q contained in a finite volume V is then given by

$$Q = \int_V d^3x\, \rho . \tag{2.21}$$

From eq. (2.16), using the dimensions of $\mu_0$ given in eq. (2.9), we find
that $\mathbf{j}$ has dimensions C/(m$^2$ s) = A/m$^2$, so $\mathbf{j}$ is a current per unit
surface. The current dI flowing through an infinitesimal surface is obtained
by defining $d\mathbf{s}$ as a vector whose modulus ds is the area of the surface, and
whose direction is equal to the normal of the surface (with a given choice
of orientation), and is then given by $dI = d\mathbf{s}\cdot\mathbf{j}$; so, the current I flowing
through a finite surface S is given by

$$I = \int_S d\mathbf{s}\cdot\mathbf{j} . \tag{2.22}$$
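The dimensional statements of this section, eqs. (2.19) and (2.20) and the dimensions of $\rho$ and $\mathbf{j}$, can be verified with simple bookkeeping of exponents. The sketch below represents a dimension as a tuple of exponents of the four base units (kg, m, s, A); this representation and the helper names `dim`, `mul`, `div`, and `power` are assumptions made here for illustration.

```python
from fractions import Fraction

def dim(kg=0, m=0, s=0, A=0):
    """A dimension as exponents of the SI base units (kg, m, s, A)."""
    return (Fraction(kg), Fraction(m), Fraction(s), Fraction(A))

def mul(*dims):
    return tuple(sum(col, Fraction(0)) for col in zip(*dims))

def div(a, b):
    return tuple(x - y for x, y in zip(a, b))

def power(a, p):
    return tuple(x * Fraction(p) for x in a)

metre, second, kilogram, ampere = dim(m=1), dim(s=1), dim(kg=1), dim(A=1)
velocity = div(metre, second)
newton = dim(kg=1, m=1, s=-2)
joule = mul(newton, metre)
coulomb = mul(ampere, second)            # from 1 A = 1 C/s
volt = div(joule, coulomb)

E_field = div(newton, coulomb)           # from F = qE in eq. (2.14)
B_field = div(E_field, velocity)         # from F = q v x B in eq. (2.14)
tesla = div(kilogram, mul(coulomb, second))

mu0 = div(newton, power(ampere, 2))      # N/A^2, eq. (2.9)
eps0 = div(power(coulomb, 2), mul(newton, power(metre, 2)))  # eq. (2.13)

rho = div(mul(eps0, E_field), metre)     # from div E = rho/eps0, eq. (2.15)
j = div(B_field, mul(mu0, metre))        # from curl B = mu0 j + ..., eq. (2.16)

print("E field is V/m:  ", E_field == div(volt, metre))
print("B field is tesla:", B_field == tesla)
print("rho is C/m^3:    ", rho == div(coulomb, power(metre, 3)))
print("j is A/m^2:      ", j == div(ampere, power(metre, 2)))
```

The same bookkeeping shows that the product of `eps0` and `mu0` has the dimensions of an inverse velocity squared, as required by eq. (2.8).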


2.2 Gaussian units 


Gaussian units, first of all, are based on the c.g.s. system (centimeters- 
grams-seconds) rather than on the m.k.s. system. By itself, this would 
only lead to trivial conversion factors. The crucial difference, however, 
is in the definition of the electric charge. One starts again from eq. (2.5), 
but now one defines the unit of electric charge setting k = 1 by definition, 
so 

$$\mathbf{F} = \frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} . \tag{2.23}$$


This is not just a rescaling of numbers compared to SI units. In the SI
system, the ampere is a fourth independent base quantity, with respect
to the units of length, time, and mass. We cannot express the ampere
(and therefore also the coulomb) as a combination of positive or negative
powers of meters, seconds, and kilograms. As a result, in the SI
the constant $k = 1/(4\pi\epsilon_0)$ is not a pure number, but has dimensions
N m$^2$/C$^2$, see eq. (2.11) (or, re-expressing it in terms of the base units,
it has dimensions kg m$^3$ s$^{-4}$ A$^{-2}$). In contrast, in the Gaussian system,
k is taken to be a pure number without dimensions. The unit of charge
defined in this way is called the esu (for “electrostatic unit”) of charge
or, equivalently, the statcoulomb (statC). Since, in Gaussian units, the
force is measured in dyne, with 1 dyne = 1 gr cm s$^{-2}$, from eq. (2.23)
we see that, in Gaussian units, the unit of charge is a derived quantity,
given by

$$1\ {\rm statC} = 1\ {\rm gr^{1/2}\,cm^{3/2}\,s^{-1}} . \tag{2.24}$$


So, as far as electromagnetism is concerned, the SI system has four base 
units (kg, m, s, A) while the Gaussian system has only three (gr, cm, s). 

Comparing the Coulomb law in the Gaussian and SI systems, we see
that the respective definitions of electric charges are related by

$$q_{\rm gau} = \frac{q_{\rm SI}}{\sqrt{4\pi\epsilon_0}} . \tag{2.25}$$

Observe, once again, that the electric charges in the two systems have
different dimensions, since $\epsilon_0$ is not a pure number. We can now consider
a charge such that $q_{\rm SI} = 1$ C, and ask what the corresponding value of
$q_{\rm gau}$ is. Inserting the value (2.10) into eq. (2.25), we get

\begin{align}
q_{\rm gau} &= 2.99792458 \times \sqrt{10^{9}\ \frac{\rm N\,m^2}{\rm C^2}} \times 1\ {\rm C} \nonumber \\
&= 2.99792458 \times \sqrt{10^{9} \times 10^{5}\ {\rm dyne} \times 10^{4}\ {\rm cm^2}} \tag{2.26} \\
&= 2.99792458 \times 10^{9}\ {\rm statC} , \tag{2.27}
\end{align}


where we have used the conversion 1 N = $10^5$ dyne and, from eq. (2.24),
1 dyne cm$^2$ = 1 gr cm$^3$ s$^{-2}$ = 1 statC$^2$. Therefore, a charge of 1 C in the
SI corresponds to a charge of $2.99792458 \times 10^{9}$ statC in the Gaussian
system. In this sense, one might be tempted to write

$$\text{“}1\ {\rm C} = 2.99792458 \times 10^{9}\ {\rm statC}\,.\text{”} \tag{2.28}$$


We have put the equality between quotes because, written as an equality
in this form, this is wrong. The relation between C and statC is not just
a simple proportionality factor, as, say, in the relation 1 m = $10^2$ cm. As
we have stressed previously, in Gaussian units the statC is a derived
unit, which can be expressed in terms of cm, gr, and s, while in the SI
system, the ampere (and therefore the coulomb) is a fourth independent
base unit, independent from m, kg, and s. If one took eq. (2.28)
literally, it would be possible to transform the statC into m.k.s. units,
using, from eq. (2.24), 1 statC = ($10^{-3}$ kg)$^{1/2}$ ($10^{-2}$ m)$^{3/2}$ s$^{-1}$, and then
eq. (2.28) would give the coulomb in terms of m, kg, and s. This would
be wrong, since the coulomb is independent from m, kg, and s. The sense
in which eq. (2.28) is correct is that, as we have written above, a charge
of 1 C in the SI system corresponds to a charge of $2.99792458 \times 10^{9}$ statC
in the Gaussian system, not that 1 C is equal to $2.99792458 \times 10^{9}$ statC.
In other words, $q_{\rm SI}$ and $q_{\rm gau}$ are quantities with different dimensions,
and C and statC also have different dimensions. However, $q_{\rm SI}$/C and
$q_{\rm gau}$/statC are both pure numbers, and are related by

$$\left( \frac{q_{\rm gau}}{\rm statC} \right) = 2.99792458 \times 10^{9} \left( \frac{q_{\rm SI}}{\rm C} \right) , \tag{2.29}$$

which is the real meaning of eq. (2.28).

Note also that, in the derivation of the relation expressed by eq. (2.28),
we have taken eq. (2.10) as an exact relation, as in the pre-2019 definition
of charge in the SI system. We can adapt this to the new SI definition
of the electric charge by stating that the coulomb is now defined by
eq. (2.4) and that the relation (2.29) remains exact. This amounts to
saying that, in Gaussian units, the electron charge is $-e_{\rm gau}$, where

$$e_{\rm gau} = 1.602176634 \times 10^{-19} \times 2.99792458 \times 10^{9}\ {\rm statC} , \tag{2.30}$$

and this relation is exact. Numerically, this gives
$e_{\rm gau} \simeq 4.803\ldots \times 10^{-10}$ statC. Then, the constant k in eq. (2.5)
becomes a measurable quantity, whose current value is consistent with
k = 1 at the level of 10 decimal figures.
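The numerical factor in eqs. (2.27) and (2.29), and the value of $e_{\rm gau}$ in eq. (2.30), can be reproduced as follows. This is a sketch under the same assumption made in the text, namely that eq. (2.10) is taken as exact; the names `STATC_PER_COULOMB` and `charge_si_to_gaussian` are introduced here for illustration only.

```python
import math

c = 299_792_458.0                 # m/s, eq. (2.1)
k_SI = 1e-7 * c**2                # N m^2/C^2, value of 1/(4 pi eps0) in eq. (2.10)

N_IN_DYNE = 1e5                   # 1 N = 10^5 dyne
M2_IN_CM2 = 1e4                   # 1 m^2 = 10^4 cm^2

# Pure number relating q_gau/statC to q_SI/C, as in eq. (2.29).
STATC_PER_COULOMB = math.sqrt(k_SI * N_IN_DYNE * M2_IN_CM2)

def charge_si_to_gaussian(q_in_coulomb):
    """Numerical value in statC of a charge whose SI value is q_in_coulomb."""
    return STATC_PER_COULOMB * q_in_coulomb

e_SI = 1.602176634e-19            # C, eq. (2.4)
e_gau = charge_si_to_gaussian(e_SI)

# Coulomb force between two electrons 1 cm apart, computed in both systems.
force_gau_dyne = e_gau**2 / 1.0**2                     # eq. (2.23), r = 1 cm
force_si_dyne = k_SI * e_SI**2 / 0.01**2 * N_IN_DYNE   # eq. (2.6), r = 0.01 m

print(f"1 C corresponds to {STATC_PER_COULOMB:.8e} statC")
print(f"e_gau = {e_gau:.6e} statC")
```

The two force values agree up to rounding, which is the content of the statement that eq. (2.29) relates the numerical values of the same physical charge in the two systems.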

Another important difference between Gaussian and SI units is in the
definition of E and B, which, in Gaussian units, are now defined by writing
the Lorentz force equation as

$$\mathbf{F} = q \left( \mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B} \right) , \tag{2.31}$$

where, here, $q = q_{\rm gau}$, $\mathbf{E} = \mathbf{E}_{\rm gau}$ and $\mathbf{B} = \mathbf{B}_{\rm gau}$, and c is the speed of
light. Then, comparing eqs. (2.14) and (2.31), and using eq. (2.25), we
see that the relations between the definitions of the electric and magnetic
fields in the Gaussian system and in the SI system are

\begin{align}
\mathbf{E}_{\rm gau} &= \sqrt{4\pi\epsilon_0}\; \mathbf{E}_{\rm SI} , \tag{2.32} \\
\mathbf{B}_{\rm gau} &= c \sqrt{4\pi\epsilon_0}\; \mathbf{B}_{\rm SI} = \sqrt{\frac{4\pi}{\mu_0}}\; \mathbf{B}_{\rm SI} . \tag{2.33}
\end{align}

Notice that, dimensionally, $[B_{\rm SI}] = [E_{\rm SI}]/[v]$, where the brackets denote
the dimensions of a quantity, and v is a velocity. In contrast, $E_{\rm gau}$
and $B_{\rm gau}$ have the same dimensions. Observe that, taking into account
eq. (2.25), $(qE)_{\rm SI} = (qE)_{\rm gau}$. In contrast, eqs. (2.25) and (2.33) imply
that $(qB)_{\rm SI} = (qB)_{\rm gau}/c$. Equation (2.25) also implies
$\mathbf{j}_{\rm gau} = \mathbf{j}_{\rm SI}/\sqrt{4\pi\epsilon_0}$.

Performing these replacements in eqs. (2.15)-(2.18), we get Maxwell’s
equations in Gaussian units,

\begin{align}
\nabla\cdot\mathbf{E} &= 4\pi\rho , \tag{2.34} \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} &= \frac{4\pi}{c}\,\mathbf{j} , \tag{2.35} \\
\nabla\cdot\mathbf{B} &= 0 , \tag{2.36} \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} &= 0 , \tag{2.37}
\end{align}

where we did not write explicitly the subscript “gau” on E, B, $\rho$, and
j. Observe that, formally, Maxwell’s equations and the Lorentz force in
the SI system can be transformed into the corresponding equations of
the Gaussian system with the replacements

$$\epsilon_0 \to \frac{1}{4\pi} , \qquad \mu_0 \to \frac{4\pi}{c^2} , \qquad
\mathbf{E} \to \mathbf{E} , \qquad \mathbf{B} \to \frac{\mathbf{B}}{c} , \tag{2.38}$$

The final part of this section involves notions that go beyond the scope
of a course of classical electrodynamics, and is not needed in the rest of
the book. It can be skipped at first reading without loss of continuity.


and without changing $\rho$ and $\mathbf{j}$. As we have seen, this does not correspond
to the actual conceptual relation between the two systems; electric
and magnetic fields in the two systems have different dimensions, and
their correct relation is given by eqs. (2.32) and (2.33), and similarly the 
correct relation between electric charges is given by eq. (2.25). However, 
the formal replacement (2.38) is a useful trick for passing quickly from 
equations in the SI system to the corresponding equations in Gaussian 
units. 
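The formal replacement (2.38) can be checked on the coefficients of eqs. (2.14)-(2.18). The sketch below is only a numerical illustration: the function name and the dictionary keys are hypothetical labels chosen here, and the value of c is the c.g.s. one.

```python
import math

def gaussian_coefficients(c):
    """Coefficients obtained by applying the replacement (2.38) to the SI
    equations (2.14)-(2.18), after dividing out the overall factor 1/c that
    B -> B/c introduces on the left-hand side of the curl B equation."""
    eps0 = 1.0 / (4.0 * math.pi)      # eps0 -> 1/(4 pi)
    mu0 = 4.0 * math.pi / c**2        # mu0  -> 4 pi / c^2
    b_scale = 1.0 / c                 # B    -> B / c
    return {
        "gauss_rho": 1.0 / eps0,                 # div E  = (...) rho
        "ampere_j": mu0 / b_scale,               # curl B = (...) j + ...
        "ampere_dEdt": mu0 * eps0 / b_scale,     # curl B = ... + (...) dE/dt
        "faraday_dBdt": -b_scale,                # curl E = (...) dB/dt
        "lorentz_vxB": b_scale,                  # F = q (E + (...) v x B)
    }

c_cgs = 2.99792458e10                 # cm/s
coeff = gaussian_coefficients(c_cgs)
for name, value in coeff.items():
    print(f"{name:13s} = {value:.6e}")
```

The resulting coefficients are $4\pi$, $4\pi/c$, $1/c$, $-1/c$, and $1/c$, which are those appearing in eqs. (2.34), (2.35), (2.37), and (2.31).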

The transition from classical to quantum electrodynamics is more nat- 
urally performed in Gaussian units, and Gaussian units are the only 
ones that are used nowadays in quantum electrodynamics and its gen- 
eralization to the Standard Model of particle physics. More precisely, 
in quantum field theory it is customary to use a slight modification of 
the Gaussian system, called rationalized Gaussian units (or Heaviside- 
Lorentz units), which differs from (unrationalized) Gaussian units just 
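by the placement of some $4\pi$ factors. Denoting by the label “rat” the
quantities in rationalized Gaussian units and, as before, by “gau” the
quantities in (unrationalized) Gaussian units, the relations are

\begin{align}
q_{\rm rat} &= \sqrt{4\pi}\; q_{\rm gau} , \tag{2.39} \\
\mathbf{E}_{\rm rat} &= \frac{1}{\sqrt{4\pi}}\, \mathbf{E}_{\rm gau} , \qquad
\mathbf{B}_{\rm rat} = \frac{1}{\sqrt{4\pi}}\, \mathbf{B}_{\rm gau} . \tag{2.40}
\end{align}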


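Thus, in rationalized Gaussian units the Coulomb force reads

$$\mathbf{F} = \frac{1}{4\pi}\,\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} , \tag{2.41}$$

which is formally the same as eq. (2.6) with $\epsilon_0 = 1$. The Lorentz force
still keeps the form (2.31), because the $\sqrt{4\pi}$ factor in q cancels those in
E, B, while Maxwell’s equations (2.34)-(2.37) become

\begin{align}
\nabla\cdot\mathbf{E} &= \rho , \tag{2.42} \\
\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} &= \frac{1}{c}\,\mathbf{j} , \tag{2.43} \\
\nabla\cdot\mathbf{B} &= 0 , \tag{2.44} \\
\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} &= 0 . \tag{2.45}
\end{align}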


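Similarly to eq. (2.38), eqs. (2.42)-(2.45) can be obtained from
Maxwell’s equations in SI units with the formal replacements

$$\epsilon_0 \to 1 , \qquad \mu_0 \to \frac{1}{c^2} , \qquad
\mathbf{E} \to \mathbf{E} , \qquad \mathbf{B} \to \frac{\mathbf{B}}{c} , \tag{2.46}$$

while leaving $\rho$ and $\mathbf{j}$ unchanged.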
The underlying reason why (rationalized) Gaussian units are the standard
choice in the context of quantum field theory is that they provide
a first step toward the definition of a system of units defined by setting
$\hbar = c = 1$, which is very convenient in the context of quantum
field theory. We have seen that the Gaussian system is characterized by
setting k = 1 in eq. (2.5), i.e., $\epsilon_0 = 1/(4\pi)$ in eq. (2.6) (exactly, as in
the pre-2019 definition of the electric charge unit, or within the current
experimental accuracy, with the modern definition based on the electron
charge). Similarly, the rationalized Gaussian system is characterized by
the choice $\epsilon_0 = 1$. Here, the crucial point is not the specific numerical
value chosen for $\epsilon_0$, but rather the fact that $\epsilon_0$ is declared “by law” to be
a pure number, without dimensions. As a consequence, as we have seen, 
the unit of electric charge becomes a derived unit that can be expressed 
in terms of the units of mass, length, and time; then, the four indepen- 
dent base units of the SI system relevant for electrodynamics (m, kg, 
s, A) reduce to only three independent units in the Gaussian system, 
which can be taken as (cm, g, s). One can push this logic further and,
after having defined the unit of time from the frequency of an atomic 
transition (see Note 3 on page 28), one can define the unit of length 
in terms of the speed of light stating that, by definition, c = 1. Here, 
again, the crucial point is not so much the precise numerical value as- 
signed to c, but rather the fact that we declare the speed of light to be a 
dimensionless quantity. Thus, in this system of units, we have a further 
reduction of base units, and the unit of length is now the same as the 
unit of time, while any velocity is a dimensionless number. Of course, 
it is easy to go back to standard units; e.g., a velocity v = 0.9 in units 
$c = 1$ corresponds to $v = 0.9 \times 2.99792458 \times 10^{10}$ cm/s in c.g.s. units.

At this stage, we remain with only two base units, the unit of time 
and the unit of mass. One can now make a further step, and also set the 
reduced Planck constant $\hbar = 1$, and now time becomes, dimensionally,
the inverse of a mass. We are then left with a single base unit, that can
be taken as the unit of mass. All other quantities are related to it in
a simple way. Denoting by [M], [L], [T] the dimensions of mass, length,
and time, respectively, we have $[L] = [T] = [M]^{-1}$, while velocities are
pure numbers, and therefore also linear momentum and energy have
dimensions of mass. In these units, even the electric charge becomes
a pure number. This can be seen by observing that (having set $\epsilon_0 = 1$),
$q^2/r$ has dimensions of an energy (it is the Coulomb energy of a system
of two charges $q_1 = q_2 = q$); however, in units $\hbar = c = 1$, $1/r$ has
dimensions of mass, exactly as the energy, so $q$ must be dimensionless.
In quantum electrodynamics, an important role is played by the fine
structure constant $\alpha$. In rationalized Gaussian units (while still using
“normal” c.g.s. units), it is defined as $\alpha = e^2/(4\pi\hbar c)$ (or by
$\alpha = e^2/(\hbar c)$ in unrationalized Gaussian units), and it can be seen, with
standard dimensional analysis, that it is a pure number. Using the value of $e$
given in eq. (2.30), one finds that its numerical value is $\alpha \simeq 1/137$. If we
then set also $\hbar = c = 1$, we see that the electron charge becomes a pure
number in agreement with the previous argument, and we find that its
numerical value is given by $e^2/(4\pi) \simeq 1/137$.⁸
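
The numbers quoted above can be checked with a few lines of Python. This is only a sketch under stated assumptions: it uses the SI expression $\alpha = e^2/(4\pi\epsilon_0\hbar c)$, which is the SI counterpart of the Gaussian definitions in the text, and the CODATA 2018 values of the constants are typed in by hand rather than taken from the book. The function names are illustrative.

```python
import math

# CODATA 2018 values in SI units (assumed here, not taken from the text).
E_CHARGE = 1.602176634e-19     # C, exact since the 2019 redefinition
HBAR = 1.054571817e-34         # J s
C_LIGHT = 299792458.0          # m/s, exact
EPSILON_0 = 8.8541878128e-12   # F/m, measured


def fine_structure_constant():
    """alpha = e^2 / (4 pi eps0 hbar c), a pure number in any system of units."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * HBAR * C_LIGHT)


def electron_charge_natural_units():
    """Dimensionless e in rationalized units with hbar = c = 1, from e^2/(4 pi) = alpha."""
    return math.sqrt(4.0 * math.pi * fine_structure_constant())


if __name__ == "__main__":
    alpha = fine_structure_constant()
    print(f"alpha   = {alpha:.6e}")
    print(f"1/alpha = {1.0 / alpha:.3f}")
    print(f"e (rationalized, hbar = c = 1) = {electron_charge_natural_units():.4f}")
```

The second function makes explicit that, once $\hbar = c = 1$, the electron charge is just a number of order 0.3 in rationalized units.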

This system of units, defined by setting $\epsilon_0$ to a pure number (whether
exactly $\epsilon_0 = 1$, or consistent with it within the current accuracy, given
the definition (2.30) of the electron charge) and also $\hbar = c = 1$, might
look weird at first sight, but, in fact, it is so convenient in quantum
mechanics and in quantum field theory that, in this context, these units
are called natural units.⁹ Of course, one has a significant loss in the
possibility of spotting mistakes in the computations by making use of
dimensional analysis, since there is a much smaller variety of basic units.
However, this is a minor aspect, and it is largely compensated by the
gain in simplicity and physical clarity of many formulas. This subject
would bring us too far from the scope of this course, but the interested
reader can find an extended discussion of $\hbar = c = 1$ units, and of
practical ways of using them, in Chapter 1 of Maggiore (2005).

8 More precisely, in a quantum field theory course one learns that the
electron charge depends, logarithmically, on the energy scale at which it
is probed (it is a “running coupling constant,” in quantum field theory
jargon). The value $\alpha \simeq 1/137$ is actually the value at low energies.

9 One can go even further and, when quantum gravity enters the game, one
can also set Newton’s constant $G = 1$. At this point, all quantities become
dimensionless. These are called Planck units.


2.3 SI or Gaussian? 


After having defined the two systems, one can now try to tackle the 
question of which system is “better.” The answer, in fact, depends on 
the context. 

The Gaussian system is more natural from the point of view of Special 
Relativity. This is partly due to the definition of the magnetic field, 
where a factor of c is reabsorbed in B compared to the SI system, so 
that E and B have the same dimensions. This is much more natural from 
the point of view of Special Relativity since, as we will see in Chapter 8, 
in a relativistic formalism E and B enter on the same footing as the 
six components of an antisymmetric tensor $F^{\mu\nu}$. As a result, many
equations of electrodynamics, in particular in its relativistic formulation, 
look more elegant and more natural in Gaussian units. On the other 
hand, SI units are much more natural in all laboratory situations. 

The situation is quite similar to the option of using units $\hbar = c = 1$.
In quantum field theory, this is the only natural choice, by now
universally used, and all equations of relativistic quantum field theory
look much cleaner and more elegant without being “cluttered” by factors of
$\hbar$ and $c$. Measuring speeds in units of the speed of light is the only natural
option in particle physics (compare the statement “a particle has speed
$v = 0.99$,” in units $c = 1$, with “a particle has speed
$v = 2.96794\ldots \times 10^{8}$ m/s”). However, outside the relativistic domain,
measuring speeds in units $c = 1$ might be very weird (a speed limit at
$v = 60$ km/h, in units $c = 1$, would become a limit at
$v = 5.559\ldots \times 10^{-8}$). So, elegance
of the fundamental equations is not the only criterion, and the attempt 
at defining a unique “best” choice, independent of the context, is futile. 
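
The two conversions quoted in this comparison can be reproduced with a short Python sketch. The helper names `to_si` and `to_natural` are illustrative and do not come from the text; the only input is the exact SI value of $c$.

```python
C_LIGHT = 299792458.0  # m/s, exact in the SI


def to_si(beta):
    """Convert a speed given in units c = 1 (a pure number) to m/s."""
    return beta * C_LIGHT


def to_natural(speed_m_per_s):
    """Convert a speed in m/s to a pure number in units c = 1."""
    return speed_m_per_s / C_LIGHT


if __name__ == "__main__":
    # A relativistic particle: natural in units c = 1, clumsy in m/s.
    print(f"v = 0.99   ->  {to_si(0.99):.6e} m/s")
    # A road speed limit: natural in km/h, clumsy in units c = 1.
    limit = 60.0 * 1000.0 / 3600.0  # 60 km/h expressed in m/s
    print(f"60 km/h    ->  v = {to_natural(limit):.4e} in units c = 1")
```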

In a sense, one might argue that the Gaussian system stopped in the
middle of a commendable path. The same logic that suggests fixing
$\epsilon_0 = 1/(4\pi)$ (or $\epsilon_0 = 1$) also suggests fixing $c = 1$ (and, in a
quantum context, $\hbar = 1$). These choices all have the effect of making the
fundamental equations of the theory more transparent, getting rid of
constants whose numerical values, in the SI system, are ultimately related
to human experience (e.g., the value of the speed of light in the SI system
depends on the definition of the meter which, as we discussed, was originally
defined as one ten-millionth of the distance from the equator to the
North Pole along a great circle) and have nothing to do with Nature at
its most fundamental level.

So, one can see the Gaussian system (with $c$ kept explicit) as an
intermediate choice between two cases: one might decide not to use at
all the logic of reducing the number of independent units through the
fundamental constants, keep $\epsilon_0$ and $c$ explicit, and define the units
so that they are convenient in typical laboratory situations; this gives
the SI system. Or else, one could decide to use this logic to the very
end, setting not only $4\pi\epsilon_0 = 1$ (or $\epsilon_0 = 1$) but also $c = 1$ and, in a
quantum context, $\hbar = 1$. This leads to more transparent formulas at a
fundamental level, in particular in a relativistic context and in quantum 
field theory. 

In conclusion, the best choice of units depends on the situation. In 
this book, we will use SI units because of the broader contexts in which 
they are used. Furthermore, it is trivial to pass from SI units to the 
rationalized Gaussian units with c = 1 used in quantum field theory: 
from eq. (2.46), we see that one can just make the formal replacements
$c \to 1$, $\epsilon_0 \to 1$, and $\mu_0 \to 1$ in all SI equations so, in practice, one
can just ignore them in all the SI equations. The inverse path, from
(rationalized or unrationalized) Gaussian units with $c = 1$ to SI, requires
instead some extra work, with dimensional analysis required to insert
the appropriate factors of $c$ and $\epsilon_0$ (or $\mu_0$).

If one rather wants to translate equations from SI to unrationalized
Gaussian units, furthermore keeping $c$ explicit, one must take care of the
placement of $4\pi$ factors and of the powers of $c$; many relevant formulas
are collected in App. A.
Maxwell’s equations


We now enter the heart of the subject, presenting Maxwell’s equations
and beginning to extract their consequences. We assume that the reader 
already has an elementary knowledge of the basic phenomenology of 
electrostatic and magnetostatic phenomena, from a first introductory 
course of electromagnetism, and we will proceed in a more formal and 
advanced manner. We will first present Maxwell’s equations, both in 
the local and the integrated form. We will then show how they imply 
conservation laws for the electric charge and for energy and momentum 
and, in the process, we will identify the energy density and momentum 
density of the electromagnetic field. We will then introduce the gauge 
potentials. While, at first, their introduction might look just a trick 
for simplifying the equations, in fact we will gradually discover that 
the gauge potentials play a fundamental role at the conceptual level. 
Their introduction also brings in the notion of gauge invariance, of which 
electromagnetism is the prototype example, and which is fundamental 
to all modern theoretical physics. Contrary to most other textbook 
treatments, we will therefore introduce them at a very early stage of 
our presentation. Identifying the symmetries of the theory is another
basic aspect of a modern approach, and we will then discuss some of
the symmetries of Maxwell’s equations (leaving, however, the discovery
of Lorentz symmetry and Special Relativity for Chapter 8, after we
have developed the necessary tools). In this chapter, we will keep
a rather formal approach. In Chapter 4 we will then discuss several 
examples and applications to electrostatics and magnetostatics, both for 
their intrinsic importance, and to illustrate the more general concepts 
developed here in simple settings, making contact with more elementary 
treatments of classical electromagnetism. 


3.1 Maxwell’s equations in vector form 


3.1.1 Local form of Maxwell’s equations 


Maxwell’s equations are formulated in terms of the electric field E(t, x)
and the magnetic field B(t, x). The field concept is among the most
fundamental of modern physics, in particular in connection with Special
Relativity. In classical mechanics, the dynamical variables describing a
mechanical system are “generalized coordinates” of the form $q_i(t)$, with
$i$ a discrete index corresponding to the degrees of freedom of the system;
these could be, for instance, the three spatial components of the position
of a particle, $\{q_x(t), q_y(t), q_z(t)\}$, or the 3N coordinates of a system of
N particles. A field can be seen as a collection of dynamical variables
labeled by a continuous spatial variable x, rather than by a discrete
index $i$, so at each point of space we have a dynamical variable (or a set
of dynamical variables, such as the three components of the electric field
and the three components of the magnetic field). As we will see in due
course, fields are the natural language for describing the electromagnetic
interactions in a way consistent with the principles of Special Relativity.

1 A test charge is a charged particle which is taken to move in a given
external electromagnetic field. Physically, this means that we can neglect
the back-action of the charge on the system that generates the external
field.

2 With a common abuse of language, we will in general refer to eq. (3.6)
as the “Lorentz force equation” even in a relativistic context. A more
accurate wording could be “Lorentz equation of motion,” but we will not
tamper with such a standard nomenclature.

In classical electromagnetism, the dynamics of the electric and mag- 
netic fields is governed by Maxwell’s equations, that we have already 
presented in Chapter 2, but rewrite here. In SI units, they read 


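\begin{align}
\nabla\cdot\mathbf{E} &= \frac{\rho}{\epsilon_0} \quad \text{(Gauss’s law)}, \tag{3.1}\\
\nabla\times\mathbf{B} &= \mu_0\,\mathbf{j} + \mu_0\epsilon_0\,\frac{\partial\mathbf{E}}{\partial t} \quad \text{(Ampère-Maxwell law)}, \tag{3.2}\\
\nabla\cdot\mathbf{B} &= 0 , \tag{3.3}\\
\nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t} \quad \text{(Faraday’s law)}. \tag{3.4}
\end{align}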


As shown in Section 2.1, $\rho(t,\mathbf{x})$ has dimensions of charge per unit
volume, so it is an electric charge density, while $\mathbf{j}(t,\mathbf{x})$ has the
dimensions of a current per unit surface and, in this sense, we will often
refer to it as the current density. These equations allow us, in principle,
to solve for E and B once the “source terms” $\rho$ and $\mathbf{j}$ (and the geometric
setting and boundary conditions of the problem) have been assigned. The
motion of a non-relativistic test charge,¹ with charge $q$ and velocity v, in
these fields is determined by the Lorentz force


$$
\mathbf{F} = q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) , \tag{3.5}
$$

where E and B are computed at the position of the particle. For a
non-relativistic point particle, we have Newton’s law F = dp/dt, where p is
the momentum of the particle, and therefore

$$
\frac{d\mathbf{p}}{dt} = q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) . \tag{3.6}
$$

In a relativistic context, force, with its instantaneous character, is no
longer a fundamental concept, while momentum still is, and we will see
later that eq. (3.6) remains valid.²

We will take $\epsilon_0$ and $\mu_0$ as two basic constants characterizing
electromagnetic phenomena. We have already met them in Section 2.1, and
we will indeed show later that eqs. (2.6) and (2.7) are a consequence of
eqs. (3.1)-(3.4). At this stage, it is also convenient to define a constant
$c$ from

$$
\epsilon_0\mu_0 = \frac{1}{c^2} . \tag{3.7}
$$


As we saw in Section 2.1, $\epsilon_0$ has dimensions of C²/(N m²) while $\mu_0$ has
dimensions of N s²/C², so $c$ has dimensions of a velocity. The use of
the letter $c$ obviously hints at the fact that this will turn out to be the
speed of light, although this will only emerge later, from the study of
Maxwell’s equations. For the moment, we presume that we do not know
this yet, and we treat it just as another constant, and we use it to rewrite
Maxwell’s equations as


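\begin{align}
\nabla\cdot\mathbf{E} &= \frac{\rho}{\epsilon_0} , \tag{3.8}\\
\nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} &= \mu_0\,\mathbf{j} , \tag{3.9}\\
\nabla\cdot\mathbf{B} &= 0 , \tag{3.10}\\
\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} &= 0 , \tag{3.11}
\end{align}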


where, if one wishes, any one of the three constants $\epsilon_0$, $\mu_0$, and $c$ can
be completely eliminated from the equations, using eq. (3.7). This form
of Maxwell’s equations is useful because, as we will see in Chapter 8, the
structure of the terms on the left-hand sides, that involve only electric
and magnetic fields, is dictated by Lorentz invariance, while the
right-hand sides of eqs. (3.8) and (3.9) show how $\epsilon_0$ and $\mu_0$ determine the
coupling to the charge density and to the current density, respectively.
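
As a quick numerical illustration of eq. (3.7), one can insert the measured SI values of the two constants and see which velocity comes out. The sketch below assumes the CODATA 2018 values, which are not quoted in the text, and the function names are illustrative.

```python
import math

# CODATA 2018 values (assumed, not quoted in the text).
EPSILON_0 = 8.8541878128e-12   # C^2 / (N m^2)
MU_0 = 1.25663706212e-6        # N s^2 / C^2


def c_from_constants(eps0, mu0):
    """Solve eq. (3.7), eps0 * mu0 = 1 / c^2, for c."""
    return 1.0 / math.sqrt(eps0 * mu0)


def mu0_from(eps0, c):
    """Eliminate mu0 in favour of eps0 and c, again using eq. (3.7)."""
    return 1.0 / (eps0 * c**2)


if __name__ == "__main__":
    c = c_from_constants(EPSILON_0, MU_0)
    print(f"c = {c:.6e} m/s")
    print(f"mu0 recovered from eps0 and c: {mu0_from(EPSILON_0, c):.11e}")
```

The second helper illustrates the remark that any one of the three constants can be eliminated from the equations in favour of the other two.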

Maxwell’s equations have gradually emerged in the 19th century as 
a means of describing and unifying a vast body of observations of elec- 
tric and magnetic phenomena and, for the moment, we will take them, 
together with the Lorentz force equation in the form (3.6), as the basic 
postulates that define the theory of classical electromagnetism. How- 
ever, after having developed the appropriate tools, we will see how they 
emerge very naturally (and, to a large extent, uniquely) from the re- 
quirement of constructing a theory of electric and magnetic phenomena 
which respects the principles of Special Relativity.³

At first sight, the electric and magnetic fields might look just like 
mathematical constructions which are convenient as an intermediate 
step: the charge and current density determine E and B through Maxwell’s 
equations, and E and B determine the motion of particles, through the 
Lorentz force equation. As we work our way through classical electro- 
dynamics, we will see that, in fact, E and B (or, even more precisely, 
other fields, the gauge potentials, from which they can be derived, and 
that we will introduce shortly) are the truly fundamental dynamical 
variables for the description of electromagnetism. We will see that field 
configurations carry energy, momentum and angular momentum, just 
like particles, and can propagate as waves in vacuum and in materi- 
als. Eventually, the fundamental nature of these fields is most clearly 
revealed in the context of quantum field theory, where one discovers 
that, in fact, all particles are described in terms of fields, and the gauge 
potentials provide a description of a particle, the photon, which is the 
mediator of the electromagnetic interaction. Even if all these develop- 
ments have to wait for later chapters (and the quantum aspects belong 
to the domain of quantum field theory and will not be covered in this 
book), it is useful to already have in mind the fundamental nature of 
the field concept. 


3 In this context, it is interesting to observe that Einstein’s 1905 paper
on Special Relativity had the title “Zur Elektrodynamik bewegter Körper”
(Annalen der Physik, 17:891, 1905), whose English translation is “On the
electrodynamics of moving bodies.” The link between Special Relativity and
electromagnetism was there from the start! An online edition of the English
translation of Einstein’s 1905 paper can be found at
https://www.fourmilab.ch/etexts/einstein/specrel/www/.
4 Historically, these integrated forms were typically the way in which the
equations of electrodynamics were first discovered. Nowadays, considering
electrodynamics as a classical field theory, the local form of the equations
is more fundamental. We will develop the classical field theory approach in
Section 8.7.


3.1.2 Integrated form of Maxwell’s equations 


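Using Gauss’s or Stokes’s theorems, we can obtain useful integrated
forms of these equations.⁴ Integrating eq. (3.8) over a volume V bounded
by a closed surface $S = \partial V$ and using Gauss’s theorem (1.48), we get

$$
\oint_S d\mathbf{s}\cdot\mathbf{E}(t,\mathbf{x}) = \frac{1}{\epsilon_0}\, Q_V(t) , \tag{3.12}
$$

where $Q_V(t)$ is the electric charge inside the volume V,

$$
Q_V(t) = \int_V d^3x\, \rho(t,\mathbf{x}) . \tag{3.13}
$$

We can rewrite eq. (3.12) as

$$
\Phi_E(t) = \frac{1}{\epsilon_0}\, Q_V(t) , \tag{3.14}
$$

where

$$
\Phi_E(t) \equiv \oint_S d\mathbf{s}\cdot\mathbf{E}(t,\mathbf{x}) \tag{3.15}
$$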


is the flux of E through the closed surface $S = \partial V$. Note that, since
the coordinates x are restricted to be on the fixed boundary S of V,
and one integrates over this boundary, the integral in eq. (3.15) is a
function of the time coordinate only (and, implicitly, of the choice of
volume V, and therefore of its boundary $\partial V$; however, to keep the
notation lighter, we omit the label V in $\Phi_E$). A similar
integration of eq. (3.10) over the volume V gives

$$
\oint_S d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) = 0 , \tag{3.16}
$$

i.e.,

$$
\Phi_B(t) = 0 , \tag{3.17}
$$

where

$$
\Phi_B(t) \equiv \oint_S d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) \tag{3.18}
$$


is the flux of the magnetic field through the closed surface $S = \partial V$. The
flux of the magnetic field through any closed surface vanishes, because 
of the absence of magnetic charges. 
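
The integrated Gauss law (3.14) can be checked numerically in a simple case. The following Python sketch assumes the familiar Coulomb field of a static point charge in SI units, $\mathbf{E} = q\,\mathbf{r}/(4\pi\epsilon_0 r^3)$, and integrates its flux over a sphere centred on the origin while the charge sits on the $z$ axis at a distance `offset` from the centre. The function name, the quadrature rule and the grid size are illustrative choices, not part of the text.

```python
import math


def flux_times_eps0_over_q(offset, radius=1.0, n_theta=20000):
    """Return eps0 * Phi_E / q for a point charge at (0, 0, offset).

    The flux is taken through the sphere of the given radius centred on the
    origin. By azimuthal symmetry only the polar angle has to be integrated,
    which is done here with the midpoint rule.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Squared distance between the charge and the point on the sphere.
        dist2 = radius**2 + offset**2 - 2.0 * radius * offset * cos_t
        # Outward normal component of eps0 * E / q at that point.
        e_dot_n = (radius - offset * cos_t) / (4.0 * math.pi * dist2**1.5)
        # Area of the ring between theta and theta + d_theta.
        total += e_dot_n * 2.0 * math.pi * radius**2 * sin_t * d_theta
    return total


if __name__ == "__main__":
    for offset in (0.0, 0.5, 2.0):
        print(f"charge at z = {offset}:  eps0 * Phi_E / q = "
              f"{flux_times_eps0_over_q(offset):.6f}")
```

With the charge anywhere inside the sphere the result is 1 to numerical accuracy, so that $\Phi_E = Q_V/\epsilon_0$; with the charge outside it is 0, since then $Q_V = 0$.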

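Integrating eq. (3.9) over a surface S with boundary $C = \partial S$ and
using Stokes’s theorem (1.38), we get the integrated form of the
Ampère-Maxwell law,

$$
\oint_C d\boldsymbol{\ell}\cdot\mathbf{B}(t,\mathbf{x}) = \mu_0\, I(t) + \frac{1}{c^2}\,\frac{d\Phi_E(t)}{dt} , \tag{3.19}
$$

where

$$
I(t) = \int_S d\mathbf{s}\cdot\mathbf{j}(t,\mathbf{x}) \tag{3.20}
$$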


is the current passing through the surface S [as we already saw in
eq. (2.22)] and $\Phi_E$ is the electric flux passing through S. Note that
the integral on the left-hand side of eq. (3.19) is the circulation of B
around the closed curve C. Finally, integrating eq. (3.11) again over a
surface S with boundary $C = \partial S$ and using Stokes’s theorem, we
get the integrated form of Faraday’s law,

$$
\oint_C d\boldsymbol{\ell}\cdot\mathbf{E}(t,\mathbf{x}) = -\frac{d\Phi_B(t)}{dt} , \tag{3.21}
$$

where $\Phi_B$ is the magnetic flux passing through S.⁵ The line integral
of E along the curve C is called the electromotive force, or emf.⁶ This
result implies that, when the magnetic flux enclosed by a loop made of
a conducting wire changes with time, a voltage is induced around the
loop, and therefore a current appears in the wire, and in this form it
was originally discovered by Faraday. The effect can be induced on a
fixed loop by a time-dependent magnetic field but can also take place
just by changing the orientation of the loop with respect to a static
magnetic field. We will discuss this in more detail in Section 4.3.


3.2 Conservation laws 


We next show that Maxwell’s equations imply a conservation law for 
the electric charge, as well as a conservation law for energy and mo- 
mentum. In the process, we will also be able to identify the energy 
and momentum carried by the electromagnetic field. In general, just 
as in classical mechanics, conservation laws are a consequence of the 
invariance of the system under some transformation. For instance, in 
classical mechanics energy conservation is a consequence of invariance 
under time translations and momentum conservation is a consequence 
of invariance under spatial translations. We will see in Section 8.7.3 how 
this relation between symmetries and conservation laws generalizes to 
classical field theory and, in particular, to the electromagnetic field. In 
this section we will rather see how these conservation laws emerge from 
simple manipulations of Maxwell’s equations. 


3.2.1 Conservation of the electric charge 


A first immediate consequence of Maxwell’s equations is a conservation 
law for the electric charge. Taking the time derivative of eq. (3.8) and 
combining it with the divergence of eq. (3.9) [and recalling that the 
divergence of a curl is zero, see eq. (1.18)], we get 
$$
\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 . \tag{3.22}
$$
5 Observe that, in eq. (3.17), $\Phi_B$ was the flux through a closed surface,
while in eq. (3.21) it is the flux through a surface which is not closed,
but rather has a boundary C.

6 The name is historical and not well chosen given that it is not a force,
but rather the line integral of a force per unit charge, i.e., a potential;
since E has dimensions of V/m, see eq. (2.19), the electromotive force is
measured in volts.
7 As a historical note, the original Ampère law was simply
$\nabla\times\mathbf{B} = \mu_0\mathbf{j}$, to be contrasted with the full
Ampère-Maxwell law (3.2). If we take its divergence, we find
$\nabla\cdot\mathbf{j} = 0$, rather than the continuity equation (3.22). In 1862,
Maxwell, with a heuristic mechanical reasoning, postulated the presence of
the extra term $\mu_0\epsilon_0\,\partial\mathbf{E}/\partial t$ in eq. (3.2).
Since the term $\epsilon_0\,\partial\mathbf{E}/\partial t$ formally adds
up to j, it was identified with a form of current, and was called the
“displacement current.” We now understand that this term rather belongs to
the left-hand side of Maxwell’s equations, as we have written in eq. (3.9),
being also present in vacuum, where there are no charges and currents, and
it is crucial to obtain current conservation in the form (3.22). However,
the idea of current conservation was not available to Maxwell, since the
electron was yet to be discovered, and he did not associate electric
currents with charges in motion. See Zangwill (2013), Section 2.2.5, and
references therein, for a historical discussion.


8 To obtain the current density, consider a beam of charged particles, with
charge density $\rho$ and velocity v, and take a surface of area $dA$
transverse to v. In a time $dt$, the charges that pass through $dA$ have
filled a volume $dV = dA\, v\, dt$. The electric charge that has gone through
the surface is therefore $dQ = \rho\, dA\, v\, dt$. The current $dI$ flowing
through the surface $dA$ is then given by $dI = dQ/dt = \rho v\, dA$ and the
current density j, i.e., the current per unit surface, has modulus
$j = dI/dA = \rho v$. Since, as a vector, it has the same direction as v,
then $\mathbf{j} = \rho\mathbf{v}$.


This is a continuity equation. Since it is a mathematical consequence
of eqs. (3.8) and (3.9), we see that, in Maxwell’s equations, the charge
density $\rho$ and the current density j cannot be chosen arbitrarily but
must respect eq. (3.22). The physical meaning of this equation can
be understood better by integrating it over a finite volume V, with
boundary $S = \partial V$, and using Gauss’s theorem (1.48). This gives

$$
\frac{d}{dt}\int_V d^3x\, \rho(t,\mathbf{x}) = -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{j} . \tag{3.23}
$$

On the left-hand side we have the time derivative of the total electric
charge $Q_V$ inside the volume V, see eq. (3.13), so eq. (3.23) states that

$$
\frac{dQ_V(t)}{dt} = -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{j} , \tag{3.24}
$$

i.e., that the variation of the charge inside a volume V is due to the flux of
electric current going through its boundary, i.e., to the charges escaping
the volume or entering it. If no charge escapes or enters, $Q_V$ is conserved.
Inside the volume V, charges are neither created nor destroyed.⁷
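
The content of eq. (3.24) can be made concrete with a one-dimensional toy model. The sketch below is only an illustration under stated assumptions: a Gaussian charge profile drifting rigidly with velocity `V`, so that $\rho(t,x) = f(x - Vt)$ and $j = \rho V$, and a "volume" that is just the interval $[x_1, x_2]$. The profile, the numerical parameters and the function names are all hypothetical choices.

```python
import math

V = 0.3  # drift velocity of the charge profile, arbitrary units


def rho(t, x):
    """Charge density: a Gaussian profile rigidly drifting with velocity V."""
    return math.exp(-(x - V * t) ** 2)


def current(t, x):
    """Current density j = rho * v for charges that all move with velocity V."""
    return rho(t, x) * V


def charge_inside(t, x1, x2, n=4000):
    """Q(t): integral of rho over [x1, x2], computed with the midpoint rule."""
    dx = (x2 - x1) / n
    return sum(rho(t, x1 + (k + 0.5) * dx) for k in range(n)) * dx


def continuity_residual(t, x1, x2, dt=1e-4):
    """dQ/dt plus the net outflow through the two end points; should vanish."""
    dq_dt = (charge_inside(t + dt, x1, x2)
             - charge_inside(t - dt, x1, x2)) / (2.0 * dt)
    outflow = current(t, x2) - current(t, x1)
    return dq_dt + outflow


if __name__ == "__main__":
    print(f"net outflow: {current(0.5, 1.0) - current(0.5, -0.2):+.6f}")
    print(f"residual   : {continuity_residual(0.5, -0.2, 1.0):+.2e}")
```

The residual is zero up to discretization error: the charge in the interval changes only because charge flows through its end points.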

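We will often consider the charge and current density generated by
an ensemble of point-like particles. The charge density of a point-like
particle with charge $q_a$, on a trajectory $\mathbf{x}_a(t)$, is

$$
\rho_a(t,\mathbf{x}) = q_a\, \delta^{(3)}[\mathbf{x} - \mathbf{x}_a(t)] , \tag{3.25}
$$

and its contribution to the current density is obtained multiplying this
by its velocity $\mathbf{v}_a(t)$,⁸

\begin{align}
\mathbf{j}_a(t,\mathbf{x}) &= \rho_a(t,\mathbf{x})\,\mathbf{v}_a(t) \tag{3.26}\\
&= q_a \mathbf{v}_a(t)\, \delta^{(3)}[\mathbf{x} - \mathbf{x}_a(t)] . \tag{3.27}
\end{align}

Then, for a collection of charges labeled by an index $a$,

\begin{align}
\rho(t,\mathbf{x}) &= \sum_a \rho_a(t,\mathbf{x}) , \tag{3.28}\\
\mathbf{j}(t,\mathbf{x}) &= \sum_a \mathbf{j}_a(t,\mathbf{x}) . \tag{3.29}
\end{align}

We can check that these expressions indeed satisfy the continuity
equation (3.22), observing that

\begin{align}
\frac{\partial\rho_a}{\partial t} &= q_a\, \frac{\partial}{\partial t}\, \delta^{(3)}[\mathbf{x} - \mathbf{x}_a(t)] \nonumber\\
&= -q_a\, \frac{dx_a^i(t)}{dt}\, \frac{\partial}{\partial x^i}\, \delta^{(3)}[\mathbf{x} - \mathbf{x}_a(t)] \nonumber\\
&= -\nabla\cdot\mathbf{j}_a . \tag{3.30}
\end{align}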


We will see in eqs. (13.35)—(13.37) how the expressions for the current 
density and the continuity equation generalize from point particles to 
fluids. 


3.2.2 Energy, momentum, and angular momentum 
of the electromagnetic field 


Energy density and energy flux 


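We next show that, from Maxwell’s equations, we can obtain conservation
equations that allow us to associate energy and momentum to the
electromagnetic field. To identify the expression for the energy of the
electromagnetic field we take the scalar product of eq. (3.11) with B,
and of eq. (3.9) with E. Subtracting them, we get

$$
\left[\mathbf{B}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{B})\right]
+ \frac{\partial}{\partial t}\left[\frac{1}{2}\left(\frac{1}{c^2}\,E^2 + B^2\right)\right]
= -\mu_0\, \mathbf{E}\cdot\mathbf{j} . \tag{3.31}
$$

The term in brackets can be rewritten using⁹

$$
\mathbf{B}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{B}) = \nabla\cdot(\mathbf{E}\times\mathbf{B}) . \tag{3.32}
$$

Therefore, dividing by $\mu_0$, eq. (3.31) becomes

$$
\frac{\partial}{\partial t}\left(\frac{\epsilon_0 E^2 + \mu_0^{-1} B^2}{2}\right)
= -\frac{1}{\mu_0}\,\nabla\cdot(\mathbf{E}\times\mathbf{B}) - \mathbf{E}\cdot\mathbf{j} , \tag{3.33}
$$

where we used eq. (3.7). Let us define the Poynting vector¹⁰

$$
\mathbf{S} = \frac{1}{\mu_0}\, \mathbf{E}\times\mathbf{B} . \tag{3.34}
$$

Integrating eq. (3.33) over a finite volume V with boundary $\partial V$ and
using Gauss’s theorem, we get

$$
\frac{d}{dt}\int_V d^3x\, \frac{\epsilon_0 E^2 + \mu_0^{-1} B^2}{2}
+ \int_V d^3x\, \mathbf{E}\cdot\mathbf{j}
= -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S} , \tag{3.35}
$$
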
which is known as Poynting’s theorem. In the absence of external charges,
j = 0, this equation is already in the form of a conservation equation: on
the left-hand side we have the time derivative of a quantity with the
dimensions of an energy, and on the right-hand side a flux coming out or in
from the boundary of the volume. So, this already suggests that the
energy density of the electromagnetic field is given by
$(\epsilon_0 E^2 + \mu_0^{-1} B^2)/2$ and the energy flux is given by the Poynting
vector. However, if we set j = 0, we could multiply both sides of eq. (3.35)
by an arbitrary number $\alpha$, so this argument at most tells us that the
energy density is of the form $\alpha\,(\epsilon_0 E^2 + \mu_0^{-1} B^2)/2$ and the
energy flux is $\alpha\mathbf{S}$, for some $\alpha$. To confirm this interpretation as
an energy conservation law, and to fix the normalization factor, we must
better understand the term $\mathbf{E}\cdot\mathbf{j}$. To this purpose, we consider a
collection of non-relativistic point-like charges $q_a$, $a = 1,\ldots,N$,
inside the volume V, with positions $\mathbf{x}_a(t)$ and velocities
$\mathbf{v}_a(t)$. We could perform the computation for generic relativistic
particles (see below) but, to understand that this is indeed an energy
conservation equation, and to fix the normalization factor $\alpha$ of the
energy density, it is sufficient to limit ourselves to non-relativistic
particles.

9 The explicit steps are as follows:

\begin{align*}
B_i \epsilon_{ijk}\partial_j E_k - E_i \epsilon_{ijk}\partial_j B_k
&= \epsilon_{ijk}\,(B_i \partial_j E_k - E_i \partial_j B_k)\\
&= -\epsilon_{ijk}\,(B_k \partial_j E_i + E_i \partial_j B_k)\\
&= -\partial_j\,(\epsilon_{ijk} E_i B_k)\\
&= +\partial_j\,(\epsilon_{jik} E_i B_k)\\
&= \nabla\cdot(\mathbf{E}\times\mathbf{B}) ,
\end{align*}

where, in the second equality, we have used the antisymmetry of
$\epsilon_{ijk}$ to write
$\epsilon_{ijk} B_i \partial_j E_k = -\epsilon_{ijk} B_k \partial_j E_i$.

10 Named after John Henry Poynting, who derived this conservation equation
in 1884.

Using eq. (3.27),


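$$
\int_V d^3x\, \mathbf{E}(t,\mathbf{x})\cdot\mathbf{j}(t,\mathbf{x})
= \sum_a q_a \mathbf{v}_a(t)\cdot\mathbf{E}[t,\mathbf{x}_a(t)] , \tag{3.36}
$$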


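where the sum over $a$ runs over all charges inside V. We now use the
Lorentz force equation (3.6) that, for a non-relativistic particle located
at $\mathbf{x}_a(t)$, with velocity $\mathbf{v}_a(t)$, charge $q_a$, and mass
$m_a$, becomes

$$
m_a \frac{d\mathbf{v}_a}{dt}
= q_a \left\{ \mathbf{E}[t,\mathbf{x}_a(t)] + \mathbf{v}_a(t)\times\mathbf{B}[t,\mathbf{x}_a(t)] \right\} . \tag{3.37}
$$

Taking the scalar product of both sides of this equation with
$\mathbf{v}_a(t)$ and using
$\mathbf{v}_a\cdot(\mathbf{v}_a\times\mathbf{B}) = 0$, we get

\begin{align}
q_a \mathbf{v}_a(t)\cdot\mathbf{E}[t,\mathbf{x}_a(t)]
&= m_a \mathbf{v}_a(t)\cdot\frac{d\mathbf{v}_a}{dt} \nonumber\\
&= \frac{d}{dt}\left(\frac{1}{2}\, m_a v_a^2\right) . \tag{3.38}
\end{align}

Thus,

\begin{align}
\int_V d^3x\, \mathbf{E}(t,\mathbf{x})\cdot\mathbf{j}(t,\mathbf{x})
&= \frac{d}{dt}\sum_a \frac{1}{2}\, m_a v_a^2 \nonumber\\
&= \frac{d\mathcal{E}_{\rm kin}}{dt} , \tag{3.39}
\end{align}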

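where $\mathcal{E}_{\rm kin}$ is the total kinetic energy of the system. Therefore
eq. (3.35) can be rewritten as

$$
\frac{d}{dt}\left[\int_V d^3x\, \frac{\epsilon_0 E^2 + \mu_0^{-1} B^2}{2} + \mathcal{E}_{\rm kin}\right]
= -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S} . \tag{3.40}
$$

This shows that, indeed, the energy of the electromagnetic field in a
volume V is

$$
\mathcal{E}_{\rm em} = \int_V d^3x\, \frac{\epsilon_0 E^2 + \mu_0^{-1} B^2}{2} , \tag{3.41}
$$

so eq. (3.40) can be written as

$$
\frac{d}{dt}\left(\mathcal{E}_{\rm em} + \mathcal{E}_{\rm kin}\right) = -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S} . \tag{3.42}
$$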


The energy density of the electromagnetic field, which we denote by
$u(t,\mathbf{x})$, can then be identified with the integrand in eq. (3.41),

$$
u = \frac{1}{2}\left(\epsilon_0 E^2 + \frac{1}{\mu_0}\, B^2\right) \tag{3.43}
$$

(where, for notational simplicity, we did not write explicitly the
arguments $(t,\mathbf{x})$ in $u$, E and B). Equivalently, eliminating $\mu_0$
with eq. (3.7),

$$
u = \frac{\epsilon_0}{2}\left(E^2 + c^2 B^2\right) . \tag{3.44}
$$

The Poynting vector (3.34) gives, instead, the energy flux across a
surface. In terms of $u$ and S, eq. (3.33) takes the form¹¹

$$
\frac{\partial u}{\partial t} + \nabla\cdot\mathbf{S} = -\mathbf{E}\cdot\mathbf{j} . \tag{3.47}
$$


When j = 0, as in a region of space where there are no charged particles,
this is a local conservation equation of the same form as eq. (3.22) for
the electric charge. However, for generic j, the right-hand side is
non-vanishing. While the electric charge inside a volume can only change if
charges flow in or out through the boundary, the energy of the
electromagnetic field inside a volume can be exchanged with the mechanical
energy of the particles, because of the work done by the electric field on
the charged particles, so the integrated conservation law rather has the form
(3.42), and the local conservation law has the form (3.47), with a term
$-\mathbf{E}\cdot\mathbf{j}$, in general non-vanishing, on the right-hand side.
Also note that, as a by-product of this derivation, eq. (3.39) shows that

$$
\frac{dW}{dt} = \int_V d^3x\, \mathbf{E}(t,\mathbf{x})\cdot\mathbf{j}(t,\mathbf{x}) \tag{3.48}
$$

is the rate at which the electric field performs work on a system of
charges and currents (while the magnetic field performs no work, since
$\mathbf{v}_a\cdot(\mathbf{v}_a\times\mathbf{B}) = 0$).
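
The local conservation law (3.47) with j = 0 can be tested numerically on a simple field configuration. The sketch below assumes a linearly polarized plane wave in vacuum, $\mathbf{E} = E_0\cos(kz - \omega t)\,\hat{\mathbf{x}}$ and $\mathbf{B} = (E_0/c)\cos(kz - \omega t)\,\hat{\mathbf{y}}$ with $\omega = ck$. That this is a solution of Maxwell’s equations is taken for granted here, since wave solutions are only derived later; the amplitude, the wavelength and the function names are arbitrary illustrative choices.

```python
import math

# CODATA 2018 values (assumed, not quoted in the text).
EPSILON_0 = 8.8541878128e-12
MU_0 = 1.25663706212e-6
C = 1.0 / math.sqrt(EPSILON_0 * MU_0)

E0 = 100.0                 # V/m, arbitrary amplitude
K = 2.0 * math.pi / 0.5    # wavenumber for a wavelength of 0.5 m
OMEGA = C * K              # vacuum dispersion relation


def fields(t, z):
    """Return (E_x, B_y) of the assumed plane wave at time t and position z."""
    e_x = E0 * math.cos(K * z - OMEGA * t)
    return e_x, e_x / C


def energy_density(t, z):
    """u = (eps0 E^2 + B^2 / mu0) / 2, eq. (3.43)."""
    e_x, b_y = fields(t, z)
    return 0.5 * (EPSILON_0 * e_x**2 + b_y**2 / MU_0)


def poynting_z(t, z):
    """z component of S = E x B / mu0, eq. (3.34), for E along x and B along y."""
    e_x, b_y = fields(t, z)
    return e_x * b_y / MU_0


def local_residual(t, z, dz=1e-4):
    """Return (du/dt + dS_z/dz, |du/dt|), using central finite differences."""
    dt = dz / C
    du_dt = (energy_density(t + dt, z) - energy_density(t - dt, z)) / (2.0 * dt)
    ds_dz = (poynting_z(t, z + dz) - poynting_z(t, z - dz)) / (2.0 * dz)
    return du_dt + ds_dz, abs(du_dt)


if __name__ == "__main__":
    residual, scale = local_residual(1.0e-9, 0.1)
    print(f"|du/dt|          = {scale:.3e} W/m^3")
    print(f"du/dt + div S    = {residual:.3e} W/m^3")
    print(f"S_z / (c u)      = {poynting_z(1.0e-9, 0.1) / (C * energy_density(1.0e-9, 0.1)):.12f}")
```

For this wave the two terms of eq. (3.47) cancel, and the energy flux is just the energy density transported at speed $c$.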

Our use of the non-relativistic limit for the point-like charges was
fully sufficient to understand that eq. (3.35) is an equation for energy
conservation, and to fix the overall coefficient in the energy density of
the electromagnetic field. However, it is not difficult, and instructive, to
also perform the computation for fully relativistic particles.¹² This can
be done if we anticipate that the Lorentz force equation for a relativistic
particle has the form (3.6), where, for a relativistic particle of mass $m_a$
and velocity $\mathbf{v}_a$, the momentum is

$$
\mathbf{p}_a = \frac{m_a \mathbf{v}_a}{\sqrt{1 - (v_a/c)^2}} , \tag{3.49}
$$

while the energy is

$$
\mathcal{E}_a = c\,\sqrt{m_a^2 c^2 + p_a^2} . \tag{3.50}
$$

(We will prove these results in Sections 7.4.3 and 8.6.1.) Inserting
eq. (3.49) into eq. (3.50) we can also write

$$
\mathcal{E}_a = \frac{m_a c^2}{\sqrt{1 - (v_a/c)^2}} . \tag{3.51}
$$

Equation (3.49) can be inverted to give

$$
\mathbf{v}_a = \frac{c\,\mathbf{p}_a}{\sqrt{m_a^2 c^2 + p_a^2}} . \tag{3.52}
$$
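
The mutual consistency of eqs. (3.49)-(3.52) is easy to verify numerically. The sketch below works with the moduli of velocity and momentum only; the choice of an electron moving at $0.9\,c$ and the CODATA 2018 value of the electron mass are illustrative assumptions, as are the function names.

```python
import math

C = 299792458.0               # m/s, exact in the SI
M_ELECTRON = 9.1093837015e-31  # kg, CODATA 2018 (assumed for illustration)


def momentum(m, v):
    """Modulus of the relativistic momentum, eq. (3.49)."""
    return m * v / math.sqrt(1.0 - (v / C) ** 2)


def energy_from_momentum(m, p):
    """Energy as a function of momentum, eq. (3.50)."""
    return C * math.sqrt(m**2 * C**2 + p**2)


def energy_from_velocity(m, v):
    """Energy as a function of velocity, eq. (3.51)."""
    return m * C**2 / math.sqrt(1.0 - (v / C) ** 2)


def velocity_from_momentum(m, p):
    """Inversion of eq. (3.49), i.e. eq. (3.52)."""
    return C * p / math.sqrt(m**2 * C**2 + p**2)


if __name__ == "__main__":
    v = 0.9 * C
    p = momentum(M_ELECTRON, v)
    print(f"p              = {p:.6e} kg m/s")
    print(f"E from (3.50)  = {energy_from_momentum(M_ELECTRON, p):.6e} J")
    print(f"E from (3.51)  = {energy_from_velocity(M_ELECTRON, v):.6e} J")
    print(f"v/c recovered  = {velocity_from_momentum(M_ELECTRON, p) / C:.12f}")
```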
11 Observe that the energy density and the energy flux are not uniquely
fixed by the integrated conservation equation (3.35). Indeed, defining

\begin{align}
u' &= u - \nabla\cdot\mathbf{v} , \tag{3.45}\\
\mathbf{S}' &= \mathbf{S} + \partial_t \mathbf{v} + \nabla\times\mathbf{w} , \tag{3.46}
\end{align}

where $u$ and S are still given by eqs. (3.34) and (3.43), and v and w are
arbitrary vector fields, the local conservation equation (3.47) is still
satisfied. The vector field v reshuffles the relative contributions of the
integral over the volume V and that over the boundary $\partial V$, while w
adds a term to the energy flux whose surface integral vanishes, since its
divergence vanishes (and vice versa, on a topologically trivial space any
vector field with vanishing divergence can be written as
$\nabla\times\mathbf{w}$, see the theorem for divergence-free fields
on page 7). The expressions for the
energy density and the energy flux ob- 
tained from eqs. (3.34) and (3.43), how- 
ever, also emerge naturally in a rela- 
tivistic formalism, as we will see in Sec- 
tion 8.4, where they become part of 
an energy-momentum tensor [although, 
in general, some freedom also exists in 
the definition of the energy-momentum 
tensor of a field, see Sections 3.2.1 and 
3.5.3 of Maggiore (2007)]. Eventually, 
at the theoretical level, the best ar- 
gument for the uniqueness of the ex- 
pressions (3.34) and (3.43) comes from 
General Relativity, where the expres- 
sion of the energy-momentum tensor is 
uniquely defined. So, for instance, the 
energy density u in eq. (3.43) is the 
quantity that couples to the gravita- 
tional field in the way required for a 
local energy density. 


12We follow the derivation in Garg 
(2012). 
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13A quantity proportional to E x B is 
a natural candidate for the momentum 
density of the electromagnetic field, 
since momentum density is a spatial 
vector, and E x B is the only vector 
that we can form with E and B. Fur- 
thermore, we will see in Section 3.4 that 
B is a pseudovector under parity, and 
then Ex B is a true vector under parity, 
just as the momentum density. 


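Then, multiplying the Lorentz equation (3.6) by $\mathbf{v}_a$, we get

$$
q_a \mathbf{v}_a\cdot\mathbf{E}[t,\mathbf{x}_a(t)] = \mathbf{v}_a\cdot\frac{d\mathbf{p}_a}{dt}
= \frac{c\,\mathbf{p}_a}{\sqrt{m_a^2c^2 + p_a^2}}\cdot\frac{d\mathbf{p}_a}{dt}
= \frac{c}{2\sqrt{m_a^2c^2 + p_a^2}}\,\frac{dp_a^2}{dt} . \tag{3.53}
$$

On the other hand, from eq. (3.50), the term in the last equality is just the same as $d\mathcal{E}_a/dt$. Thus,

$$
\int_V d^3x\, \mathbf{E}(t,\mathbf{x})\cdot\mathbf{j}(t,\mathbf{x}) = \sum_a q_a \mathbf{v}_a\cdot\mathbf{E}[t,\mathbf{x}_a(t)]
= \frac{d\mathcal{E}_{\rm kin,rel.}}{dt} , \tag{3.54}
$$

where $\mathcal{E}_{\rm kin,rel.} = \sum_a \mathcal{E}_a$ is the total relativistic kinetic energy of the particles, and eq. (3.40) holds in the form

$$
\frac{d}{dt}\int_V d^3x\, \frac{\epsilon_0 E^2 + \mu_0^{-1} B^2}{2} + \frac{d\mathcal{E}_{\rm kin,rel.}}{dt} = -\int_{\partial V} d\mathbf{s}\cdot\mathbf{S} . \tag{3.55}
$$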


Momentum and momentum flux 


Besides energy, the electromagnetic field also carries momentum. To identify its expression, we can proceed similarly to what we have done for energy conservation, using Maxwell’s equations to obtain a conservation law in which the part depending on the sources can be identified with the time derivative of their mechanical momentum (just as, in eq. (3.40), we found a conservation equation in which a term was the time derivative of the mechanical energy of the sources). To this purpose, we consider the quantity¹³

$$
\mathbf{g}(t,\mathbf{x}) = \epsilon_0\, \mathbf{E}(t,\mathbf{x})\times\mathbf{B}(t,\mathbf{x}) , \tag{3.56}
$$

which is related to the Poynting vector (3.34) by

$$
\mathbf{g}(t,\mathbf{x}) = \frac{1}{c^2}\,\mathbf{S}(t,\mathbf{x}) . \tag{3.57}
$$


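To prove that eq. (3.56) indeed represents the momentum of the electromagnetic field, we take its time derivative,

$$
\frac{\partial g_i}{\partial t} = \epsilon_0\, \epsilon_{ijk}\left( \frac{\partial E_j}{\partial t}\, B_k + E_j\, \frac{\partial B_k}{\partial t} \right) . \tag{3.58}
$$

Using Maxwell’s equations (3.9, 3.11) to compute $\partial E_j/\partial t$ and $\partial B_k/\partial t$, we get

$$
\frac{\partial g_i}{\partial t} = \epsilon_0 E_j(\partial_j E_i - \partial_i E_j) + \epsilon_0 c^2 B_j(\partial_j B_i - \partial_i B_j) - (\mathbf{j}\times\mathbf{B})_i , \tag{3.59}
$$

where, as usual, $\partial_i = \partial/\partial x^i$. We now use the identity

$$
E_j(\partial_j E_i - \partial_i E_j) = -\partial_j\left( \frac{1}{2}E^2\delta_{ij} - E_iE_j \right) - E_i \nabla\cdot\mathbf{E}
= -\partial_j\left( \frac{1}{2}E^2\delta_{ij} - E_iE_j \right) - \frac{1}{\epsilon_0}\,\rho E_i , \tag{3.60}
$$

where, in the second line, we used Gauss’s law (3.8). Similarly,

$$
B_j(\partial_j B_i - \partial_i B_j) = -\partial_j\left( \frac{1}{2}B^2\delta_{ij} - B_iB_j \right) , \tag{3.61}
$$

since, in this case, $\nabla\cdot\mathbf{B} = 0$. Then, eq. (3.58) becomes

$$
\frac{\partial g_i}{\partial t} = -\partial_j T_{ij} - (\rho\mathbf{E} + \mathbf{j}\times\mathbf{B})_i , \tag{3.62}
$$

where

$$
T_{ij} = \epsilon_0\left[ \left( \frac{1}{2}E^2\delta_{ij} - E_iE_j \right) + c^2\left( \frac{1}{2}B^2\delta_{ij} - B_iB_j \right) \right] , \tag{3.63}
$$

or, equivalently,

$$
T_{ij} = \epsilon_0\left( \frac{1}{2}E^2\delta_{ij} - E_iE_j \right) + \mu_0^{-1}\left( \frac{1}{2}B^2\delta_{ij} - B_iB_j \right) . \tag{3.64}
$$

This tensor (or, depending on conventions, its negative) is called the Maxwell stress tensor.¹⁴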
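As an illustration, eq. (3.64) can be evaluated directly for given field values. The sketch below is a minimal pure-Python version, assuming SI units and the sign convention of eq. (3.64); the function names and the field values are hypothetical. It also checks two simple properties that follow from eq. (3.64): $T_{ij}$ is symmetric, and its trace equals $(\epsilon_0 E^2 + \mu_0^{-1}B^2)/2$, the energy density that appears in eq. (3.55).

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m
MU0 = 4.0e-7 * math.pi    # vacuum permeability in H/m (conventional value, adequate here)

def maxwell_stress(E, B):
    """Return T_ij of eq. (3.64) as a 3x3 nested list."""
    E2 = sum(c * c for c in E)
    B2 = sum(c * c for c in B)
    T = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            T[i][j] = (EPS0 * (0.5 * E2 * delta - E[i] * E[j])
                       + (0.5 * B2 * delta - B[i] * B[j]) / MU0)
    return T

def energy_density(E, B):
    """Field energy density (eps0 E^2 + B^2 / mu0) / 2."""
    return 0.5 * EPS0 * sum(c * c for c in E) + 0.5 * sum(c * c for c in B) / MU0

# Illustrative field values (V/m and T), not taken from the text.
E = (3.0e3, -1.0e3, 2.0e3)
B = (1.0e-5, 4.0e-5, -2.0e-5)
T = maxwell_stress(E, B)
trace = T[0][0] + T[1][1] + T[2][2]
print(trace, energy_density(E, B))
```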

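Integrating eq. (3.62) over a volume $V$ that includes all charges and currents, and using eq. (3.66), we get

$$
\frac{d(P_{\rm em})_i}{dt} = -\int_V d^3x\, (\rho\mathbf{E} + \mathbf{j}\times\mathbf{B})_i - \int_{\partial V} d^2s\, \hat{n}_j T_{ij} , \tag{3.65}
$$

where we have defined

$$
\mathbf{P}_{\rm em}(t) = \int_V d^3x\, \mathbf{g}(t,\mathbf{x}) \tag{3.66}
$$

$$
\phantom{\mathbf{P}_{\rm em}(t)} = \epsilon_0 \int_V d^3x\, (\mathbf{E}\times\mathbf{B})(t,\mathbf{x}) . \tag{3.67}
$$

We now observe that the Lorentz force exerted on an infinitesimal volume $d^3x$, which contains a charge distribution $\rho(t,\mathbf{x})$ with a velocity field $\mathbf{v}(t,\mathbf{x})$, is obtained writing $dq = \rho\, d^3x$ in eq. (3.6), and $\rho(t,\mathbf{x})\mathbf{v}(t,\mathbf{x}) = \mathbf{j}(t,\mathbf{x})$ is the current density at the point $\mathbf{x}$ and time $t$, see Note 8 on page 44. Therefore, the Lorentz force equation on an extended distribution of charges and currents can be written as

$$
\frac{d\mathbf{P}_{\rm mech}}{dt} = \int_V d^3x\, (\rho\mathbf{E} + \mathbf{j}\times\mathbf{B}) , \tag{3.68}
$$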
¹⁴ There are different conventions on the sign used in the definition of the Maxwell stress tensor. For instance, Landau and Lifschits (1975) and Garg (2012) define the Maxwell stress tensor as the expression that appears in eq. (3.64), as we do, while in Jackson (1998) and Griffiths (2017) it is defined as minus the tensor given in eq. (3.64). An advantage of using the tensor $T_{ij}$ defined in eq. (3.63) or eq. (3.64) rather than its negative (independently of which one is called the “Maxwell stress tensor”) is that, from eq. (3.62), $T_{ij}$ is the momentum flux, rather than its negative, so momentum conservation takes the same form as energy conservation, eq. (3.40). A related advantage emerges in the context of the relativistic formulation of electrodynamics: as we will see in Section 8.4, the energy density of the electromagnetic field will turn out to be the (00) component of a Lorentz tensor $T^{\mu\nu}$, and $T_{ij}$, as defined from eq. (3.63), will turn out to be equal to the $\mu = i$, $\nu = j$ component of $T^{\mu\nu}$, rather than its negative.




¹⁵ The expression “covariant,” referred
to an equation, means that the left- 
and right-hand sides of the equation 
transform in the same way under the 
given transformation. This is a gener- 
alization of the notion of invariance, in 
which the left- and right-hand sides do 
not change. For instance, an equation 
such as F = ma is covariant under ro- 
tations, since both the left- and right- 
hand sides transform as vectors under 
rotations. Therefore, if the equation 
holds in a reference frame, it also holds 
in a rotated frame. When we use the 
word “covariant” without further spec- 
ification, we will refer to the covariance 
under the transformation of the Lorentz 
group (spatial rotations and boosts), 
that we will introduce in Chapter 7. 


where $\mathbf{P}_{\rm mech}$ is the mechanical momentum of the extended source distribution. Therefore, eq. (3.65) can be written as

$$
\frac{d}{dt}(\mathbf{P}_{\rm em} + \mathbf{P}_{\rm mech})_i = -\int_{\partial V} d^2s\, \hat{n}_j T_{ij} . \tag{3.69}
$$

This has the required form of a conservation equation, and shows that $\mathbf{P}_{\rm em}$, defined in eq. (3.67), is indeed the momentum of the electromagnetic field (so $\mathbf{g}(t,\mathbf{x})$ is the momentum density), while $\hat{n}_j T_{ij}$ is the flow of the $i$-th component of the momentum of the electromagnetic field through a surface normal to $\hat{\mathbf{n}}$.

We will confirm this result in Section 8.4 with a covariant computation 
(which will first require the development of the covariant formulation of 
electrodynamics),¹⁵ and in Section 8.7.3, with a computation that will
make use of the machinery of classical field theory and Noether’s theo- 
rem, which is more involved, but will make it clear that the conservation 
law (3.69) is related to the invariance under spatial translations, confirm- 
ing the interpretation of Pem as the momentum of the electromagnetic 
field. 

Equation (3.69) is valid in full generality, for a relativistic source. In the non-relativistic limit, $d\mathbf{P}_{\rm mech}/dt$ becomes the same as the mechanical force $\mathbf{F}$ exerted on a system of charges and currents, localized in a volume $V$. Therefore, eq. (3.69) can be written as

$$
F_i = -\frac{1}{c^2}\int_V d^3x\, \frac{\partial S_i}{\partial t} - \int_{\partial V} d^2s\, T_{ij}\hat{n}_j \tag{3.70}
$$

$$
\phantom{F_i} = -\int_V d^3x \left[ \frac{1}{c^2}\frac{\partial S_i}{\partial t} + \partial_j T_{ij} \right] , \tag{3.71}
$$

where, in the second line, we have “undone” Gauss’s theorem to write back the surface integral as a volume integral. Therefore, the electromagnetic field exerts a force per unit volume $f_i = dF_i/dV$, given by

$$
f_i = -\frac{1}{c^2}\frac{\partial S_i}{\partial t} - \partial_j T_{ij} . \tag{3.72}
$$

Whenever $\partial S_i/\partial t = 0$, the force in eq. (3.71) can be written as a surface integral. This happens in particular in electrostatics, where $\mathbf{B} = 0$, or in magnetostatics, where $\mathbf{E} = 0$, so in both cases $\mathbf{S} = 0$. We will discuss some applications of these results in Section 4.1.7.

The previous result shows that the momentum density of the electro- 
magnetic field, g, is related to the energy flux, which is given by the 
Poynting vector S, by eq. (3.57). It is interesting to understand this 
relation in the following way. Consider a beam of relativistic particles propagating with velocity $v$ along a given direction, all with energy $\mathcal{E}$ and momentum $p$. According to eqs. (3.49) and (3.51),

$$
p = \frac{v\,\mathcal{E}}{c^2} . \tag{3.73}
$$

Let $dN = n\, dA\, dt$ be the number of particles of the beam that cross a transverse area $dA$ in a time $dt$. Since each particle carries an energy $\mathcal{E}$, the corresponding energy flux (energy per unit area and unit time) is $n\mathcal{E}$ while, since each particle carries a momentum $p$, the total momentum carried by these particles is $p\, dN = p\, n\, dA\, dt$. On the other hand, given that their velocity is $v$, in a time $dt$ they have filled a volume $dV = dA\, v\, dt$, so their momentum density is

$$
\frac{p\, n\, dA\, dt}{dA\, v\, dt} = \frac{p\, n}{v} . \tag{3.74}
$$

Inserting $p$ from eq. (3.73), we therefore find that the momentum density is $n\mathcal{E}/c^2$, so it is equal to $1/c^2$ times the energy flux. So, eq. (3.57) is the relation that should be expected for a collection of particles. This may come as a surprise. In the classical treatment of electromagnetism, there is no notion of a particle associated with the momentum and the energy flux of the electromagnetic field. This is actually a hint of the fact that, at the quantum level, a notion of particle, the photon, will emerge.
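The counting argument above can be turned into a few lines of code. This is only a sketch of the bookkeeping: `n_flux` stands for the quantity $n$ of the text (particles per unit area and unit time), and the numerical values are arbitrary.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def beam_densities(n_flux, energy, v):
    """Energy flux and momentum density of a beam of identical particles.

    n_flux: particles crossing a unit transverse area per unit time
    energy: energy carried by each particle
    v:      speed of the particles
    """
    p = v * energy / C**2               # eq. (3.73)
    energy_flux = n_flux * energy       # energy per unit area and unit time
    momentum_density = p * n_flux / v   # eq. (3.74)
    return energy_flux, momentum_density

# Illustrative values (SI units), not taken from the text.
flux, g = beam_densities(n_flux=1.0e20, energy=1.6e-10, v=0.9 * C)
print(g, flux / C**2)
```

The two printed numbers coincide, which is eq. (3.57) for a beam of particles.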


Angular momentum 


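One can similarly show that the electromagnetic field carries an angular momentum

$$
\mathbf{J}_{\rm em} = \int_V d^3x\, \mathbf{x}\times\mathbf{g}(t,\mathbf{x}) , \tag{3.75}
$$

where $\mathbf{g}(t,\mathbf{x})$ is the momentum density, i.e., $\mathbf{x}\times\mathbf{g}$ is the angular momentum density of the electromagnetic field.¹⁶ Using the explicit expression (3.56),

$$
\mathbf{J}_{\rm em} = \epsilon_0 \int_V d^3x\, \mathbf{x}\times(\mathbf{E}\times\mathbf{B}) . \tag{3.76}
$$

Taking the time derivative and performing manipulations analogous to those performed previously for momentum conservation gives angular momentum conservation in the form

$$
\frac{d}{dt}(\mathbf{J}_{\rm em} + \mathbf{J}_{\rm mech})_i = -\int_{\partial V} d^2s\, \hat{n}_j M_{ij} , \tag{3.77}
$$

where the flux of angular momentum is

$$
M_{ij} = \epsilon_{ikl}\, x_k T_{lj} \tag{3.78}
$$

(where $T_{lj}$ is the Maxwell stress tensor), and

$$
\frac{d\mathbf{J}_{\rm mech}}{dt} = \int_V d^3x\, \mathbf{x}\times(\rho\mathbf{E} + \mathbf{j}\times\mathbf{B}) . \tag{3.79}
$$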


We will rederive this result explicitly in Section 8.7.3, using the for- 
malism of Noether’s theorem. The fact that the electromagnetic field 
carries energy, momentum, and angular momentum, just as a mechanical 
system, shows that, already at the classical level, E and B must be con- 
sidered as real physical entities (in the same sense in which we think of 
particles as real physical entities), and are not just useful mathematical 
constructions. 
¹⁶ In analogy with the notation used in quantum mechanics, for the angular momentum of the electromagnetic field we prefer to use the notation $\mathbf{J}_{\rm em}$, rather than $\mathbf{L}_{\rm em}$. Indeed, as we will show in Section 8.7.3 using the formalism of classical field theory and Noether’s theorem, the expression given in eq. (3.75) can be rewritten as the sum of two terms that, at quantum level, correspond to the orbital angular momentum and to the spin part. We will then denote the mechanical angular momentum of the particles by $\mathbf{J}_{\rm mech}$.

¹⁷ See Maggiore (2005) for an introduction to quantum field theory and quantum electrodynamics with a conceptual unity with this book.


3.3 Gauge potentials and gauge invariance 


We will now rewrite Maxwell’s equations introducing a scalar potential $\phi$ and a vector potential $\mathbf{A}$, that we will call collectively the “gauge potentials.” At the level of classical electromagnetism, this might look at
first just like a useful mathematical trick for rewriting the equations in 
a simpler form. However, gauge potentials are much more fundamen- 
tal. One realizes this, already at the level of classical electrodynamics, 
when expressing the theory in a Lagrangian formalism, as we will do in 
Section 8.7.2. There, one discovers that they play the role that “general- 
ized coordinates” have in the description of classical mechanical systems. 
The Lagrangian formalism is the starting point for the quantization of a 
field theory, so the gauge potentials also have the role of the basic fields 
in terms of which the quantization procedure is carried out. Classical 
electrodynamics is also the simplest example of a gauge theory, i.e., a 
theory built with gauge fields, of the type that we will introduce below, 
and with an invariance under “gauge transformations,” that again will 
be introduced next. The generalization of these concepts plays a crucial 
role in modern physics, particularly in particle physics and in condensed 
matter. Even if the extension to more general gauge theories, as well as 
all aspects related to quantization, go beyond the scope of this book, it 
is useful to be aware of them to already have a correct perspective.¹⁷

At the mathematical level Maxwell’s equations, in the form (3.8)–(3.11), are two vector equations (three components each) and two scalar equations, for a total of eight equations, for six fields $E_i$ and $B_i$. It is therefore clear that there must be some degree of interdependence among them, otherwise in general they would not admit solutions. The introduction of the gauge potentials will show, first of all, how to reduce them to four equations for the four fields $(\phi, \mathbf{A})$. Furthermore, we will see that, in terms of these variables, there is an extra symmetry (gauge symmetry), that allows us to further reduce the number of independent fields and equations.

Let us begin with the Maxwell equation (3.10), $\nabla\cdot\mathbf{B} = 0$. Because of the theorem for divergence-free fields given on page 7 (and taking into account that we work in $\mathbb{R}^3$, so that every surface in $V$ can be continuously shrunk to a point), there exists a vector field $\mathbf{A}(t,\mathbf{x})$ such that

$$
\mathbf{B} = \nabla\times\mathbf{A} . \tag{3.80}
$$

The field $\mathbf{A}$ is called the vector potential. Consider next Faraday’s law (3.11). In terms of $\mathbf{A}$, it can be rewritten as

$$
\nabla\times\left( \mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} \right) = 0 . \tag{3.81}
$$

Using now the theorem for curl-free fields, see again page 7, we conclude that there exists a function $\phi(t,\mathbf{x})$ such that

$$
\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\phi , \tag{3.82}
$$

(where the minus sign is a convention in the definition of $\phi$), and therefore

$$
\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} . \tag{3.83}
$$


Thus, in terms of the scalar and vector potentials, the two Maxwell’s equations that do not depend on the sources, eqs. (3.10) and (3.11), are automatically satisfied, because of the identities $\nabla\cdot(\nabla\times\mathbf{A}) = 0$ and $\nabla\times(\nabla\phi) = 0$. We can now insert eqs. (3.80) and (3.83) into the two remaining Maxwell’s equations, eqs. (3.8) and (3.9). This gives

$$
\nabla^2\phi + \frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = -\frac{\rho}{\epsilon_0} \tag{3.84}
$$

and

$$
\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left( \nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} \right) = -\mu_0\,\mathbf{j} , \tag{3.85}
$$

where we have used eq. (1.21). Thus, Maxwell’s equations are completely equivalent to eqs. (3.84) and (3.85), i.e., to four equations for the four fields $(\phi, \mathbf{A})$. The solutions for $\mathbf{E}$ and $\mathbf{B}$ can then be obtained using the definitions (3.80) and (3.83). The electromagnetic field is therefore completely specified by four functions, the scalar field $\phi$ and the three components of the vector field $\mathbf{A}$, rather than by six quantities (the three components of $\mathbf{E}$ and the three components of $\mathbf{B}$).

In fact, even this description in terms of gauge potentials is redundant. Indeed, consider the simultaneous transformation of $\phi$ and $\mathbf{A}$ given by

$$
\mathbf{A} \to \mathbf{A}' = \mathbf{A} - \nabla\theta , \qquad
\phi \to \phi' = \phi + \frac{\partial\theta}{\partial t} , \tag{3.86}
$$

where $\theta$ is an arbitrary function.¹⁸ Since $\nabla\times(\nabla\theta) = 0$, this transformation does not affect $\mathbf{B}$. Similarly, the transformation of $\phi$ is chosen so as to cancel the transformation of $\mathbf{A}$ in eq. (3.83), so $\mathbf{E}$ is also unchanged. The physical, observable, quantities are the electric and magnetic fields. The potentials $(\phi, \mathbf{A})$ and $(\phi', \mathbf{A}')$ are therefore physically equivalent, since they describe the same electric and magnetic fields. The transformation (3.86) is called a gauge transformation. The fields $\mathbf{E}$ and $\mathbf{B}$ do not change under this transformation and are therefore examples of gauge-invariant quantities. Maxwell’s equations (3.8)–(3.11) are obviously gauge invariant, since they depend only on $\mathbf{E}$ and $\mathbf{B}$.
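Gauge invariance can also be checked numerically. The following sketch evaluates $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$ by central finite differences, before and after a gauge transformation of the form (3.86). The potentials, the gauge function $\theta = \sin(xt) + t\,y z^2$, the step `H`, and the evaluation point are all hypothetical choices made only for illustration.

```python
import math

H = 1e-4  # finite-difference step; an arbitrary choice for this sketch

def partial(f, point, k):
    """Central-difference derivative of f(t, x, y, z) with respect to argument k."""
    up, down = list(point), list(point)
    up[k] += H
    down[k] -= H
    return (f(*up) - f(*down)) / (2.0 * H)

def fields(phi, A, point):
    """Return (E, B) from eqs. (3.83) and (3.80) at point = (t, x, y, z)."""
    E = tuple(-partial(phi, point, i + 1) - partial(A[i], point, 0) for i in range(3))
    B = (partial(A[2], point, 2) - partial(A[1], point, 3),
         partial(A[0], point, 3) - partial(A[2], point, 1),
         partial(A[1], point, 1) - partial(A[0], point, 2))
    return E, B

# Hypothetical potentials, chosen only for illustration.
phi = lambda t, x, y, z: x * y * math.sin(t) + z**2
A = (lambda t, x, y, z: y * z * t,
     lambda t, x, y, z: math.cos(x) * t**2,
     lambda t, x, y, z: x + y * z)

# Derivatives of theta = sin(x t) + t y z^2, written out analytically.
dtheta_dt = lambda t, x, y, z: x * math.cos(x * t) + y * z**2
grad_theta = (lambda t, x, y, z: t * math.cos(x * t),
              lambda t, x, y, z: t * z**2,
              lambda t, x, y, z: 2.0 * t * y * z)

# Gauge-transformed potentials: A' = A - grad(theta), phi' = phi + d(theta)/dt.
phi_new = lambda t, x, y, z: phi(t, x, y, z) + dtheta_dt(t, x, y, z)
A_new = tuple((lambda t, x, y, z, i=i: A[i](t, x, y, z) - grad_theta[i](t, x, y, z))
              for i in range(3))

point = (0.3, 0.5, -0.2, 0.7)
E_old, B_old = fields(phi, A, point)
E_new, B_new = fields(phi_new, A_new, point)
max_diff = max(abs(a - b) for a, b in zip(E_old + B_old, E_new + B_new))
print(max_diff)
```

The residual difference is of the order of the finite-difference truncation error, consistent with the fields being unchanged.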

Since $\theta$ can be chosen arbitrarily without changing the physics, we
can choose it so that eqs. (3.84) and (3.85), written in terms of the new 


¹⁸ We always assume that functions such as $\theta(t,\mathbf{x})$ are continuous and infinitely differentiable. In particular, subsequent derivatives applied to $\phi$, $\mathbf{A}$, or $\theta$ commute, e.g., $\partial_i\partial_j\theta = \partial_j\partial_i\theta$. We will not repeat these conditions further in the following.

¹⁹ This gauge was first introduced by L. V. Lorenz in 1867. It is often misspelled as “Lorentz gauge” (with an extra “t”), after H. A. Lorentz (the person after whom the Lorentz transformations are named; in 1867, however, he was just 14 years old...). This “misprint” has only been widely recognized in relatively recent times, thanks to Jackson and Okun (2001).


gauge potentials $(\phi', \mathbf{A}')$, are simpler. A convenient choice is obtained observing that, from eq. (3.86),

$$
\nabla\cdot\mathbf{A}' + \frac{1}{c^2}\frac{\partial\phi'}{\partial t} = \left( \nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} \right) - \Box\theta , \tag{3.87}
$$

where

$$
\Box = -\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2 \tag{3.88}
$$

is called the d’Alembertian operator (or, colloquially, the “Box” operator). The d’Alembertian is invertible (we will perform its inversion explicitly in Section 10.1, using the method of Green’s functions). Therefore, an equation of the form $\Box\theta = f$ always admits solutions [with suitable boundary conditions at spatial infinity for $f(t,\mathbf{x})$] so, for any given value of the initial gauge potentials $\phi$ and $\mathbf{A}$, we can choose $\theta$ such that the left-hand side of eq. (3.87) vanishes. Omitting hereafter, for notational simplicity, the prime on the transformed gauge fields, we have reached the gauge

$$
\nabla\cdot\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0 . \tag{3.89}
$$

This is called the Lorenz gauge.¹⁹ In this gauge, eqs. (3.84) and (3.85) become

$$
\Box\phi = -\frac{\rho}{\epsilon_0} \tag{3.90}
$$

and

$$
\Box\mathbf{A} = -\mu_0\,\mathbf{j} , \tag{3.91}
$$

and therefore have the form of wave equations, that we will study in detail in Chapter 9.
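Another convenient gauge choice is

$$
\nabla\cdot\mathbf{A} = 0 , \tag{3.92}
$$

which can always be reached because the Laplacian is invertible (again, assuming suitable boundary conditions at spatial infinity). This is called the Coulomb gauge. In this gauge, eqs. (3.84) and (3.85) become

$$
\nabla^2\phi = -\frac{\rho}{\epsilon_0} \tag{3.93}
$$

and

$$
\Box\mathbf{A} = -\mu_0\,\mathbf{j} + \frac{1}{c^2}\frac{\partial}{\partial t}\nabla\phi . \tag{3.94}
$$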
As we will discuss in Section 4.1.1, for static sources eq. (3.93) is just Poisson’s equation, the basic equation of electrostatics and, for a point-like charge, it gives rise to the Coulomb potential (this is the origin of the name of this choice of gauge). For a generic charge distribution, we will see in Section 4.1.2 how eq. (3.93) can be solved for $\phi$ by inverting the Laplacian. The solution for $\phi$ can then be inserted in eq. (3.94) to solve for $\mathbf{A}$. Note, however, that in this gauge the equations for $\phi$ and for $\mathbf{A}$ are very different. The equation for $\phi$ involves a Laplacian, and the solution vanishes if the source term $\rho = 0$ (again, with vanishing boundary conditions at infinity). In contrast, the equation for $\mathbf{A}$ involves the d’Alembertian. As we will see in Chapter 9, even when the source term vanishes, $\mathbf{j} = 0$, and $\phi = 0$, this equation has non-vanishing solutions in the form of plane waves.


3.4 Symmetries of Maxwell’s equations 


We will now examine some of the symmetries of Maxwell’s equations. First of all, Maxwell’s equations are obviously invariant under spatial or temporal translations. There is no preferred origin of time or of space in the equations, and time and space derivatives are invariant under translations, e.g., $\partial/\partial(t + t_0) = \partial/\partial t$, for any constant $t_0$. As we know from classical mechanics, in a mechanical system invariance under time translations implies energy conservation, and invariance under spatial translations implies momentum conservation. When we develop a field theoretical approach in Section 8.7, we will see how this translates into the conservation equations for energy and momentum that we have found in Section 3.2.

Maxwell’s equations are also clearly invariant under spatial rotations: 
in eqs. (3.8) and (3.10) both the left- and right-hand sides are scalars 
under rotations, so if the equations hold in a reference frame, they also 
hold in a rotated frame. Similarly, in eqs. (3.9) and (3.11) both the left- 
and right-hand sides are vectors under rotations so, again, they trans- 
form in the same manner under rotations and, if they hold in a reference 
frame, they hold in a rotated frame. Just as in classical mechanics, this 
will translate into the conservation of angular momentum. 

Another important symmetry of Maxwell’s equations is parity, which is related to reflection of the axes. We see by inspection that, if we transform the spatial coordinates as $\mathbf{x} \to -\mathbf{x}$, while at the same time we transform the fields and the sources as

$$
\mathbf{E}(t,\mathbf{x}) \to -\mathbf{E}(t,-\mathbf{x}) , \qquad \mathbf{B}(t,\mathbf{x}) \to \mathbf{B}(t,-\mathbf{x}) , \tag{3.95}
$$

and

$$
\rho(t,\mathbf{x}) \to \rho(t,-\mathbf{x}) , \qquad \mathbf{j}(t,\mathbf{x}) \to -\mathbf{j}(t,-\mathbf{x}) , \tag{3.96}
$$

the transformed Maxwell’s equations, rewritten in terms of the transformed variable $\mathbf{x}' = -\mathbf{x}$, are the same as the original ones. Note that a space-time event $P$, that has coordinates $(t,\mathbf{x})$ in the original frame, has coordinates $(t,-\mathbf{x})$ in the transformed frame. The change in the arguments in the previous transformations therefore simply reflects this change of label of the given space-time point $P$. The non-trivial aspect is the sign in front of the various quantities: the transformation of $\mathbf{E}$ is that of a polar (or “normal,” or “true”) vector, while that of $\mathbf{B}$ is that of an axial vector (or “pseudovector”). Similarly, $\rho(t,\mathbf{x})$ is a scalar under parity (while a pseudoscalar is defined as a quantity that, under rotations, is a scalar, but picks up an overall minus sign under parity) and $\mathbf{j}$ is a polar vector. Finally, comparing the definitions (3.80) and (3.83) with eq. (3.95), we see that, under parity, $\phi$ is a scalar field (rather than a pseudoscalar) and $\mathbf{A}$ is a polar vector field.

The fact that electrodynamics is invariant under parity might seem, 
naively, an obvious fact. Shouldn’t the laws of physics be the same under
a change of the orientation of the three axes, or under reflection in 
a mirror? This was, somewhat implicitly, the point of view until the 
1950s, when it was discovered that there is another interaction, the 
weak interaction, that, in fact, is not invariant under parity. So, the 
invariance of electromagnetism under parity is a non-trivial fact. 

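Another symmetry of Maxwell’s equations is time-reversal, $t \to -t$. In this case, the equations are invariant if we transform

$$
\mathbf{E}(t,\mathbf{x}) \to \mathbf{E}(-t,\mathbf{x}) , \qquad \mathbf{B}(t,\mathbf{x}) \to -\mathbf{B}(-t,\mathbf{x}) , \tag{3.97}
$$

while we transform the sources as

$$
\rho(t,\mathbf{x}) \to \rho(-t,\mathbf{x}) , \qquad \mathbf{j}(t,\mathbf{x}) \to -\mathbf{j}(-t,\mathbf{x}) . \tag{3.98}
$$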


These are the natural transformation properties under time reversal: a 
current is proportional to the velocity of the charges that produce it, so 
it changes sign if we reverse the direction of time. Magnetic fields, which 
are generated by currents, therefore must also change the sign. Electric 
charges and electric fields, instead, do not reverse the sign. Just as with 
parity, time-reversal is an invariance of electromagnetic interactions, but 
is violated by weak interactions. 
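The sign assignments in eqs. (3.95) and (3.97) can be cross-checked against the Lorentz force $q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ at a single point. The sketch below assumes, as in mechanics, that a velocity flips sign both under parity and under time reversal; the field arguments are ignored, since only the signs are being tested, and all names and values are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def negate(a):
    """Flip the sign of every component."""
    return tuple(-c for c in a)

def lorentz_force(q, E, v, B):
    """Lorentz force q (E + v x B) on a point charge."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

# Illustrative values in arbitrary units.
q = 2.0
E = (1.0, -2.0, 0.5)
v = (0.3, 0.1, -0.4)
B = (0.0, 1.5, 2.0)

F = lorentz_force(q, E, v, B)
# Parity: E and v are polar vectors (flip sign), B is axial (unchanged).
F_parity = lorentz_force(q, negate(E), negate(v), B)
# Time reversal: v and B flip sign, E is unchanged.
F_time_reversed = lorentz_force(q, E, negate(v), negate(B))

print(F, F_parity, F_time_reversed)
```

Under parity the force flips sign, as a polar vector should, while under time reversal it is unchanged, as expected for a quantity proportional to an acceleration.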

Actually, Maxwell’s equations have a much larger symmetry which is 
not readily apparent from the form (3.8)—(3.11) in which we have written 
them, which is based on the spatial vectors E and B. This symmetry is 
the covariance under Lorentz transformations, i.e., the transformations 
of Special Relativity. We will discover this symmetry in Chapter 8, after 
having developed a more convenient formalism in Chapter 7. 

Finally, in the absence of sources, i.e., when $\rho = 0$ and $\mathbf{j} = 0$, Maxwell’s equations also have a duality symmetry

$$
\mathbf{E} \to c\,\mathbf{B} , \qquad \mathbf{B} \to -\frac{\mathbf{E}}{c} . \tag{3.99}
$$

This symmetry is broken by the source terms.
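A quick numerical illustration of a duality transformation of the form $\mathbf{E} \to c\mathbf{B}$, $\mathbf{B} \to -\mathbf{E}/c$: it leaves the energy density $(\epsilon_0 E^2 + \mu_0^{-1}B^2)/2$ and the Poynting vector $\mu_0^{-1}\mathbf{E}\times\mathbf{B}$ unchanged. The sketch below checks this for arbitrary field values at a point; the function names and numbers are hypothetical.

```python
import math

C = 299_792_458.0              # speed of light in m/s
EPS0 = 8.8541878128e-12        # vacuum permittivity in F/m
MU0 = 1.0 / (EPS0 * C**2)      # vacuum permeability from eps0 * mu0 * c^2 = 1

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def energy_density(E, B):
    """u = (eps0 E^2 + B^2 / mu0) / 2."""
    return 0.5 * EPS0 * sum(c * c for c in E) + 0.5 * sum(c * c for c in B) / MU0

def poynting(E, B):
    """S = (E x B) / mu0."""
    return tuple(c / MU0 for c in cross(E, B))

def dual(E, B):
    """Duality transformation E -> c B, B -> -E / c."""
    return tuple(C * b for b in B), tuple(-e / C for e in E)

# Illustrative field values (V/m and T), not taken from the text.
E = (100.0, -50.0, 20.0)
B = (2.0e-7, 1.0e-7, -3.0e-7)
E_d, B_d = dual(E, B)

print(energy_density(E, B), energy_density(E_d, B_d))
print(poynting(E, B), poynting(E_d, B_d))
```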


Elementary applications of 
Maxwell’s equations 


Historically, the mathematical descriptions of phenomena in the do- 
mains of electrostatics, magnetostatics, and electromagnetic induction 
provided the building blocks from which Maxwell’s equations were even- 
tually inferred, with a “bottom-up” approach in which equations valid 
in specific settings were generalized and unified. Such a bottom-up approach, besides being important for providing a historical perspective, is
also appropriate for a first elementary introduction to electromagnetism. 
In this more advanced course, we take instead a “top-down” approach 
in which, after having presented the full Maxwell’s equations in the pre- 
vious chapter, we systematically develop their consequences starting, in 
this chapter, with the most elementary applications. This, of course, 
does not respect the actual historical development, but has the advan- 
tage of allowing for a logically clear and streamlined presentation. In 
the main text of the chapter we discuss a selection of important results, 
while several other applications are discussed in a long Solved Problems 
section, at the end of the chapter. 


4.1 Electrostatics 


In terms of the electric field, the fundamental equations of electrostatics are obtained from eqs. (3.8) and (3.11), setting the time derivative in eq. (3.11) to zero, so they are

$$
\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} , \tag{4.1}
$$

$$
\nabla\times\mathbf{E} = 0 . \tag{4.2}
$$

It is convenient to use eq. (4.2) to introduce the field $\phi$ from $\mathbf{E} = -\nabla\phi$ (making use of the theorem for curl-free fields, see page 7), so that eq. (4.2) is automatically satisfied, and eq. (4.1) becomes Poisson’s equation

$$
\nabla^2\phi = -\frac{\rho}{\epsilon_0} . \tag{4.3}
$$

Of course, this could have also been obtained setting the time derivatives in eqs. (3.83) and (3.84) to zero. We will use eq. (4.3) as our basic equation to be solved in electrostatics.
¹ This is an example of the superposition principle. Since Poisson’s equation (4.3) is linear, both in $\phi$ and in the source term $\rho$, if $\rho$ is taken to be a sum of terms $\rho = \sum_a \rho_a$, the solution for $\phi$ (that, as we will see, is unique) is $\phi = \sum_a \phi_a$, where $\phi_a$ is the solution when the source term is $\rho_a$. This linearity is not just a property of electrostatics but extends to the full Maxwell’s equations. As we see from eqs. (3.8)–(3.11), Maxwell’s equations are linear, both in the fields $\mathbf{E}$, $\mathbf{B}$, and in the sources $\rho$, $\mathbf{j}$. As a consequence, suppose that, when the charge and current densities are given by some functions $\rho_1$, $\mathbf{j}_1$, Maxwell’s equations have a solution $\mathbf{E}_1$, $\mathbf{B}_1$ (we omit the arguments $(t,\mathbf{x})$ for notational simplicity) and that, when the sources are given by $\rho_2$, $\mathbf{j}_2$, they have a solution $\mathbf{E}_2$, $\mathbf{B}_2$. Then, from the linearity of the equations it follows that, when the charge and current densities are given by $\rho = \rho_1 + \rho_2$ and $\mathbf{j} = \mathbf{j}_1 + \mathbf{j}_2$, Maxwell’s equations have a solution $\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2$, $\mathbf{B} = \mathbf{B}_1 + \mathbf{B}_2$.


4.1.1 Coulomb’s law 


As a first application we will show how, in electrostatics, Maxwell’s equations imply Coulomb’s law. In Section 4.1.2 we will solve eq. (4.3) for a generic distribution of charges. Here, we limit ourselves to a point-like charge $q$ located at the origin, so that

$$
\rho(\mathbf{x}) = q\,\delta^{(3)}(\mathbf{x}) . \tag{4.4}
$$

Then, eq. (4.3) becomes

$$
\nabla^2\phi = -\frac{q}{\epsilon_0}\,\delta^{(3)}(\mathbf{x}) . \tag{4.5}
$$

A solution can be found immediately using eq. (1.90), that gives

$$
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r} , \tag{4.6}
$$

which, as expected, is the Coulomb potential of a point-like charge. This solution is unique once one imposes the boundary condition that $\phi$ vanishes as $r \to \infty$ (in Section 4.1.5 we will provide a proof of the uniqueness of the solution to electrostatic problems). Using the expression (1.23) of the gradient in polar coordinates, the electric field $\mathbf{E} = -\nabla\phi$ generated by a charge $q$ at the origin is therefore

$$
\mathbf{E}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,\hat{\mathbf{r}} . \tag{4.7}
$$

From the Lorentz force equation (3.5), the force $\mathbf{F}_2$ exerted on a charge $q_2$ located in $\mathbf{x}_2 = \mathbf{r}$ by a charge $q_1$ located at $\mathbf{x}_1 = 0$ is $\mathbf{F}_2 = q_2\mathbf{E}_1$, so

$$
\mathbf{F}_2 = \frac{1}{4\pi\epsilon_0}\,\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} . \tag{4.8}
$$

This confirms that the constant $\epsilon_0$ that appears in the Maxwell equation (3.1) is the same that appears in the Coulomb force (2.6).

The result (4.6) is immediately generalized to the case where the source term is given by a set of $N$ point particles with charges $q_a$ and positions $\mathbf{x}_a$, so that

$$
\rho(\mathbf{x}) = \sum_{a=1}^{N} q_a\,\delta^{(3)}(\mathbf{x} - \mathbf{x}_a) . \tag{4.9}
$$

Then

$$
\nabla^2\phi = -\frac{1}{\epsilon_0}\sum_{a=1}^{N} q_a\,\delta^{(3)}(\mathbf{x} - \mathbf{x}_a) , \tag{4.10}
$$

whose solution is¹

$$
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N} \frac{q_a}{|\mathbf{x} - \mathbf{x}_a|} . \tag{4.11}
$$

4.1.2 Electric field from a generic static charge density


We now consider an electric charge density $\rho(\mathbf{x})$, independent of time and localized in space (i.e., non-vanishing only inside a finite volume), but otherwise generic, while we set the electric current $\mathbf{j} = 0$. Note that this choice is consistent with the continuity equation (3.22). It is convenient to work in the Coulomb gauge (3.92), where Maxwell’s equations take the form (3.93, 3.94). Since $\rho(\mathbf{x})$ is independent of time, $\phi(\mathbf{x})$, derived from eq. (3.93), is also independent of time.² Then, eq. (3.94) becomes simply $\Box\mathbf{A} = 0$. Non-vanishing solutions of this equation describe plane waves, and will be discussed in Chapter 9. Here, we are interested in the solution sourced by the static charge density, so we set $\mathbf{A} = 0$, which is the trivial solution of $\Box\mathbf{A} = 0$ (and is obviously consistent with the condition $\nabla\cdot\mathbf{A} = 0$ that defines the Coulomb gauge).

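An equation such as (3.93) can be conveniently solved with the method of Green’s functions. The Green’s function of the Laplacian operator, $G(\mathbf{x},\mathbf{x}')$, is defined as the solution of the equation

$$
\nabla^2_{\mathbf{x}}\, G(\mathbf{x},\mathbf{x}') = \delta^{(3)}(\mathbf{x} - \mathbf{x}') , \tag{4.12}
$$

[where the subscript $\mathbf{x}$ in the Laplacian indicates that $\nabla^2_{\mathbf{x}}$ acts on the $\mathbf{x}$ variable of $G(\mathbf{x},\mathbf{x}')$]. Since the right-hand side depends only on the difference $\mathbf{x} - \mathbf{x}'$, we can already anticipate that, in fact, $G(\mathbf{x},\mathbf{x}')$ will only depend on $\mathbf{x}$, $\mathbf{x}'$ through the combination $\mathbf{x} - \mathbf{x}'$, and we will therefore write it as $G(\mathbf{x} - \mathbf{x}')$. Once a solution of eq. (4.12) has been found, the corresponding solution of eq. (3.93) is given by⁴

$$
\phi(\mathbf{x}) = -\frac{1}{\epsilon_0}\int d^3x'\, G(\mathbf{x} - \mathbf{x}')\,\rho(\mathbf{x}') . \tag{4.13}
$$

In fact, taking the Laplacian of both sides,

$$
\nabla^2\phi(\mathbf{x}) = -\frac{1}{\epsilon_0}\int d^3x'\, \left[ \nabla^2_{\mathbf{x}} G(\mathbf{x} - \mathbf{x}') \right] \rho(\mathbf{x}')
= -\frac{1}{\epsilon_0}\int d^3x'\, \delta^{(3)}(\mathbf{x} - \mathbf{x}')\,\rho(\mathbf{x}')
= -\frac{\rho(\mathbf{x})}{\epsilon_0} . \tag{4.14}
$$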

The Green’s function method can be applied to more general linear differential equations, where a differential operator (here the Laplacian), acting on a function, must be equal to a given source term. Its advantage is that it allows us to separate the problem of solving a differential equation such as (3.93) into two steps. First, one searches for the Green’s function. This part of the problem is independent of the source term [here $\rho(\mathbf{x})$] and for several operators, such as the Laplacian, can be solved exactly. Then, one is left with the computation of the integral in (4.13). For a generic source term it will not be possible to perform it exactly, but the integral might be amenable to useful analytic approximations (or direct numerical evaluation).
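As an illustration of the second step, the integral in eq. (4.13) can be estimated with a simple midpoint rule on a cubic grid. The sketch below assumes the standard free-space Green’s function of the Laplacian, $G(\mathbf{x} - \mathbf{x}') = -1/(4\pi|\mathbf{x} - \mathbf{x}'|)$, and uses a uniformly charged ball as a hypothetical test source, for which the exterior potential must coincide with that of a point charge carrying the total charge $Q$. The grid size `n` and all names are arbitrary choices, not a recommended numerical scheme.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m

def green_laplacian(x, xp):
    """Free-space Green's function of the Laplacian, -1 / (4 pi |x - x'|)."""
    return -1.0 / (4.0 * math.pi * math.dist(x, xp))

def potential_on_grid(x, rho, half_width, n):
    """Midpoint-rule estimate of phi(x) = -(1/eps0) * integral of G(x - x') rho(x').

    The source is sampled on n^3 cells covering the cube [-half_width, half_width]^3;
    the observation point x should lie outside the region where rho is non-zero.
    """
    h = 2.0 * half_width / n
    centers = [-half_width + (k + 0.5) * h for k in range(n)]
    total = 0.0
    for xp0 in centers:
        for xp1 in centers:
            for xp2 in centers:
                xp = (xp0, xp1, xp2)
                total += green_laplacian(x, xp) * rho(xp) * h**3
    return -total / EPS0

R = 1.0          # radius of the ball in m (illustrative)
RHO0 = 1.0e-9    # uniform charge density in C/m^3 (illustrative)

def uniform_ball(xp):
    """Charge density of a uniformly charged ball centered at the origin."""
    return RHO0 if math.dist(xp, (0.0, 0.0, 0.0)) <= R else 0.0

x_obs = (3.0, 0.0, 0.0)
phi_numeric = potential_on_grid(x_obs, uniform_ball, half_width=R, n=20)

Q = 4.0 / 3.0 * math.pi * R**3 * RHO0
phi_exact = Q / (4.0 * math.pi * EPS0 * 3.0)
print(phi_numeric, phi_exact)
```

The two values agree to within the discretization error of the grid, which is dominated by how well the cubic cells approximate the volume of the ball.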
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2One could always add to the solution 
of the inhomogeneous equation V? = 
—p/eo, a solution of the homogeneous 
equation V? = 0, which could be 
taken to be time dependent. However, 
for a localized distribution of charges 
we impose the boundary condition that 
ġ = 0 as r — ov, and this fixes to zero 
the solution of the homogeneous equa- 
tion, see Section 4.1.5. 


3 Actually, setting for instance x’ = 0, 
we see that eq. (4.12) is invariant under 
rotations of the variable x around the 
origin (or, for x’ generic, is invariant 
under rotations of x—x’), so we can also 
anticipate that G(x — x’) (that, as we 
will see later, is unique) will actually be 
a function only of the modulus |x — x’|. 


4 As long as the integral converges at
infinity, which, as we will see below, is 
the case for a localized distribution of 
charges. 
5 For the Laplacian, the solution of
eq. (3.93), with the boundary condition
that $\phi(\mathbf{x}) \to 0$ as $|\mathbf{x}| \to \infty$, is unique,
as we will show in Section 4.1.5, and 
therefore also the Green’s function is 
unique. This will not be true for other 
operators such as the d’Alembertian de- 
fined in eq. (3.88) and, indeed, in Sec- 
tion 10.1 we will study the physics asso- 
ciated with different Green’s functions 
of the d’Alembertian. 


6 This can be shown as follows: 


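$$\begin{aligned}
\partial_i \frac{1}{|\mathbf{x}-\mathbf{x}'|} &= \partial_i \left(|\mathbf{x}-\mathbf{x}'|^2\right)^{-1/2} \\
&= -\frac{1}{2|\mathbf{x}-\mathbf{x}'|^3}\,\partial_i |\mathbf{x}-\mathbf{x}'|^2 \\
&= -\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|^3}\,.
\end{aligned}$$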
The first step for solving the differential equation (3.93) is therefore
to find the Green’s function of the Laplacian.⁵ This can be done using
eq. (1.91), which shows that the Green’s function of the Laplacian is

$$G(\mathbf{x}-\mathbf{x}') = -\frac{1}{4\pi|\mathbf{x}-\mathbf{x}'|}\,. \tag{4.15}$$


As anticipated, this is actually a function only of $|\mathbf{x}-\mathbf{x}'|$. Then, from
eq. (4.13), the general solution for the scalar potential generated by a
localized charge density is

$$\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{4.16}$$

Observe that the condition that the charge density is localized, i.e., that
the function $\rho(\mathbf{x})$ has compact support, ensures the convergence of the
integral at infinity. For a set of point-like particles, inserting eq. (4.9)
into eq. (4.16), we recover eq. (4.11). In electrostatics, the electric field
is obtained from

$$\mathbf{E}(\mathbf{x}) = -\nabla\phi\,, \tag{4.17}$$
since A = 0 in eq. (3.83). Therefore, 
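$$\begin{aligned}
E_i(\mathbf{x}) &= -\frac{1}{4\pi\epsilon_0}\,\partial_i \int d^3x'\,\frac{\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \\
&= -\frac{1}{4\pi\epsilon_0}\int d^3x'\,\rho(\mathbf{x}')\,\partial_i \frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= \frac{1}{4\pi\epsilon_0}\int d^3x'\,\rho(\mathbf{x}')\,\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|^3}\,,
\end{aligned} \tag{4.18}$$
where we used⁶
$$\partial_i \frac{1}{|\mathbf{x}-\mathbf{x}'|} = -\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|^3}\,. \tag{4.19}$$
Therefore, in vector form,
$$\mathbf{E}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\rho(\mathbf{x}')\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,. \tag{4.20}$$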
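As a numerical illustration of eqs. (4.16) and (4.20), the following sketch replaces the integrals by Riemann sums over small cubic cells filling a uniformly charged ball, and compares the result, at a point outside the ball, with the field of a point charge carrying the same total charge. The radius `R`, the density `rho0`, the grid size, and all function names are hypothetical choices made for this example only.

```python
import numpy as np

EPS0 = 8.8541878128e-12            # vacuum permittivity in F/m
K = 1.0 / (4.0 * np.pi * EPS0)     # prefactor 1/(4 pi eps0)

def ball_cells(radius, n):
    """Centers (and common volume) of the cubic cells whose centers lie inside a ball."""
    edges = np.linspace(-radius, radius, n + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    gx, gy, gz = np.meshgrid(centers, centers, centers, indexing="ij")
    pts = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)
    inside = np.sum(pts**2, axis=1) <= radius**2
    return pts[inside], (2.0 * radius / n) ** 3

def potential(x, src, dq):
    """Riemann sum for eq. (4.16): sum over cells of dq / |x - x'|."""
    dist = np.linalg.norm(x - src, axis=1)
    return K * np.sum(dq / dist)

def field(x, src, dq):
    """Riemann sum for eq. (4.20): sum over cells of dq (x - x') / |x - x'|^3."""
    sep = x - src
    dist = np.linalg.norm(sep, axis=1)
    return K * np.sum((dq / dist**3)[:, None] * sep, axis=0)

R = 0.1        # ball radius in m (hypothetical)
rho0 = 1e-6    # uniform charge density in C/m^3 (hypothetical)
src, dv = ball_cells(R, 40)
dq = np.full(len(src), rho0 * dv)   # charge carried by each cell
Q = dq.sum()                        # total charge of the discretized ball

x = R * np.array([1.0, 2.0, 2.0])   # observation point outside the ball, |x| = 3R
r = np.linalg.norm(x)
phi_num, E_num = potential(x, src, dq), field(x, src, dq)
phi_exact, E_exact = K * Q / r, K * Q * x / r**3   # point charge Q at the origin

print("phi_num / phi_exact =", phi_num / phi_exact)
print("relative error on E =", np.linalg.norm(E_num - E_exact) / np.linalg.norm(E_exact))
```

Outside a spherically symmetric distribution the field coincides with that of a point charge at the center, so the two printed numbers should be close to 1 and to 0, respectively. The residual difference comes from the cubic cells approximating the spherical boundary.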


Another useful variant of eq. (4.20) is obtained starting from the second 
line in eq. (4.18), and writing 


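$$\begin{aligned}
E_i(\mathbf{x}) &= -\frac{1}{4\pi\epsilon_0}\int d^3x'\,\rho(\mathbf{x}')\,\frac{\partial}{\partial x_i}\frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= \frac{1}{4\pi\epsilon_0}\int d^3x'\,\rho(\mathbf{x}')\,\frac{\partial}{\partial x'_i}\frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= -\frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\partial\rho(\mathbf{x}')}{\partial x'_i}\,\frac{1}{|\mathbf{x}-\mathbf{x}'|}\,,
\end{aligned} \tag{4.21}$$

where, in the last line, we integrated by parts, and used the fact that
$\rho(\mathbf{x})$ is localized to discard the boundary term. Then,

$$\mathbf{E}(\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\nabla_{\mathbf{x}'}\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{4.22}$$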


We can then compute the force exerted between two distributions of
charges, $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x})$, that we take as localized in two non-overlapping
volumes, denoted by $V_1$ and $V_2$, respectively. The force $d\mathbf{F}_1$ exerted on
the infinitesimal charge

$$dq_1 = d^3x\,\rho_1(\mathbf{x})\,, \tag{4.23}$$

contained in the infinitesimal volume $d^3x$, by the electric field $\mathbf{E}_2(\mathbf{x})$
generated by the charge distribution $\rho_2(\mathbf{x})$, is given by

$$d\mathbf{F}_1 = dq_1\,\mathbf{E}_2(\mathbf{x})\,. \tag{4.24}$$
Therefore, integrating over the volume of the first charge distribution,
$$\mathbf{F}_1 = \int_{V_1} d^3x\,\rho_1(\mathbf{x})\,\mathbf{E}_2(\mathbf{x})\,. \tag{4.25}$$


Using eq. (4.20), this can be written as 


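$$\mathbf{F}_1 = \frac{1}{4\pi\epsilon_0}\int_{V_1} d^3x \int_{V_2} d^3x'\,\rho_1(\mathbf{x})\,\rho_2(\mathbf{x}')\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,. \tag{4.26}$$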


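Note that, since $\rho_1$ vanishes outside $V_1$ and $\rho_2$ vanishes outside $V_2$, we
could actually extend the integrations over $d^3x$ and $d^3x'$ to all of space.⁷

The force $\mathbf{F}_2$ exerted on the charge density $\rho_2$ by the charge density $\rho_1$
is obtained by exchanging $1 \leftrightarrow 2$ in eq. (4.26). Then, after also exchanging
the names of the integration variables $\mathbf{x} \leftrightarrow \mathbf{x}'$, we get

$$\mathbf{F}_2 = -\frac{1}{4\pi\epsilon_0}\int_{V_1} d^3x \int_{V_2} d^3x'\,\rho_1(\mathbf{x})\,\rho_2(\mathbf{x}')\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,, \tag{4.27}$$
and see that $\mathbf{F}_2 = -\mathbf{F}_1$, so the force satisfies Newton’s third law. If we
apply this to two point charges, setting
$$\rho_1(\mathbf{x}) = q_1\,\delta^{(3)}(\mathbf{x}) \tag{4.28}$$
and
$$\rho_2(\mathbf{x}') = q_2\,\delta^{(3)}(\mathbf{x}' - \mathbf{r})\,, \tag{4.29}$$

eq. (4.27) gives back, of course, Coulomb’s law (4.8).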

In Section 5.5, we will discuss how to obtain this force from a contin- 
uous generalization of the Coulomb potential, and the relation of such 
a potential to the energy stored in the electric field generated by $\rho_1(\mathbf{x})$
and $\rho_2(\mathbf{x})$.
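The double integral in eq. (4.26) and the relation $\mathbf{F}_2 = -\mathbf{F}_1$ can be checked numerically. The sketch below is only an illustration under stated assumptions: each charge distribution is represented by equal point charges sampled uniformly inside a ball, units are chosen so that $1/(4\pi\epsilon_0) = 1$, and the charges `Q1`, `Q2`, the separation `D`, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ball(center, radius, n):
    """n points drawn uniformly inside a ball of given center and radius."""
    d = rng.normal(size=(n, 3))
    d /= np.linalg.norm(d, axis=1, keepdims=True)      # random unit vectors
    r = radius * rng.random(n) ** (1.0 / 3.0)          # radii uniform in volume
    return np.asarray(center) + r[:, None] * d

def force_on_first(x1, q1, x2, q2):
    """Discretized eq. (4.26), in units where 1/(4 pi eps0) = 1."""
    sep = x1[:, None, :] - x2[None, :, :]              # x - x' for every pair
    dist3 = np.linalg.norm(sep, axis=2) ** 3
    weight = (q1[:, None] * q2[None, :]) / dist3
    return np.sum(weight[:, :, None] * sep, axis=(0, 1))

n = 400
Q1, Q2, D = 2.0, 3.0, 3.0                              # hypothetical charges and separation
x1 = sample_ball([0.0, 0.0, 0.0], 0.5, n)              # first distribution, volume V1
x2 = sample_ball([D, 0.0, 0.0], 0.5, n)                # second distribution, volume V2
q1 = np.full(n, Q1 / n)
q2 = np.full(n, Q2 / n)

F1 = force_on_first(x1, q1, x2, q2)                    # force on distribution 1
F2 = force_on_first(x2, q2, x1, q1)                    # roles 1 <-> 2 exchanged
F_coulomb = np.array([-Q1 * Q2 / D**2, 0.0, 0.0])      # point charges at the centers

print("F1        =", F1)
print("F2        =", F2)
print("F_coulomb =", F_coulomb)
```

The sum `F1 + F2` vanishes up to rounding errors, as required by Newton's third law, while `F1` agrees with the Coulomb force between the total charges only up to the sampling noise of the random points.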
7 Note also that, thanks to the condition
that the two charge densities
are non-overlapping, there is no problem
of convergence of the integral as
$|\mathbf{x}-\mathbf{x}'| \to 0$, since there is a minimum
distance $d$ between the localization regions
of the two distributions, so that
$\rho_1(\mathbf{x})\rho_2(\mathbf{x}') = 0$ if $|\mathbf{x}-\mathbf{x}'| < d$.
8 Compare, for instance, with the prototype
mechanical example, where the
positive work $W_{\rm ext} = mgh$ done by an
external agent to lift a mass $m$ from
$z = 0$ to $z = h$ against the gravitational
force $\mathbf{F} = -mg\,\hat{\mathbf{z}}$ is minus the work
$W_{\rm grav} = \int \mathbf{F}\cdot d\mathbf{x} = \int_0^h (-mg\,\hat{\mathbf{z}})\cdot(dz\,\hat{\mathbf{z}}) = -mgh$,
done by the gravitational field.


4.1.3 Scalar gauge potential and electrostatic 
potential 


In the elementary treatment of electrostatics, one starts from eq. (3.4)
with a vanishing magnetic field, $\nabla\times\mathbf{E} = 0$, and uses the theorem for
curl-free vectors, see page 7, to introduce the electrostatic potential $\varphi(\mathbf{x})$
from $\mathbf{E} = -\nabla\varphi$. Comparing with eq. (3.83), we see that the scalar gauge
potential $\phi(t,\mathbf{x})$ is the generalization of the electrostatic potential $\varphi(\mathbf{x})$
to the general setting of time-dependent electric and magnetic fields,
and reduces to it when we can neglect all time dependences. We will
then use the notation $\phi(\mathbf{x})$ even in the context of electrostatics. Using
eq. (1.37), the equation 


$$\mathbf{E}(\mathbf{x}) = -\nabla\phi(\mathbf{x})\,, \tag{4.30}$$


can be integrated to give 
$$\phi(\mathbf{x}) - \phi(\mathbf{x}_0) = -\int_C d\mathbf{x}'\cdot\mathbf{E}(\mathbf{x}')\,, \tag{4.31}$$

where the line integral is carried out over an arbitrary path $C$ connecting
an arbitrary initial point $\mathbf{x}_0$ to the point $\mathbf{x}$. The integral in eq. (4.31)
is independent of the path $C$ that connects $\mathbf{x}_0$ and $\mathbf{x}$: the difference
between the integral computed on a path $C_1$ and that on a path $C_2$,
both with endpoints $\mathbf{x}_0$ and $\mathbf{x}$, is in fact the same as the integral over
the closed loop $C_1 - C_2$ (where we denote by $C_1 - C_2$ the loop where first
we go from $\mathbf{x}_0$ to $\mathbf{x}$ following $C_1$, and then back to $\mathbf{x}_0$ following $C_2$ in
the opposite sense). As already discussed after eq. (1.37), for a function
$\mathbf{E}$ that can be written as a gradient, the line integral over a closed loop
vanishes, so the integral in eq. (4.31) is independent of the path.
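The path independence of eq. (4.31) can be illustrated numerically for the field of a point charge. The sketch below is a minimal example, assuming a unit charge at the origin in units where $q/(4\pi\epsilon_0) = 1$; the end points, the intermediate vertex of the second path, and the function names are hypothetical choices.

```python
import numpy as np

def E_point(x):
    """Field of a unit point charge at the origin, in units where q/(4 pi eps0) = 1."""
    r = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / r**3

def phi_point(x):
    """Corresponding potential, vanishing at infinity."""
    return 1.0 / np.linalg.norm(x)

def polyline(vertices, n_per_leg=20000):
    """Finely sampled piecewise-straight path through the given vertices."""
    vertices = np.asarray(vertices, dtype=float)
    t = np.linspace(0.0, 1.0, n_per_leg, endpoint=False)[:, None]
    legs = [a + t * (b - a) for a, b in zip(vertices[:-1], vertices[1:])]
    legs.append(vertices[-1:])
    return np.vstack(legs)

def line_integral_E(path):
    """Midpoint-rule approximation of the line integral of E along the path."""
    mid = 0.5 * (path[1:] + path[:-1])
    dx = np.diff(path, axis=0)
    return np.sum(np.einsum("ij,ij->i", E_point(mid), dx))

x0 = np.array([1.0, 0.0, 0.0])
x = np.array([0.0, 2.0, 1.0])

# Path C1: straight segment. Path C2: detour through a far-away vertex.
dphi_direct = -line_integral_E(polyline([x0, x]))
dphi_detour = -line_integral_E(polyline([x0, [3.0, 3.0, 3.0], x]))
dphi_exact = phi_point(x) - phi_point(x0)

print(dphi_direct, dphi_detour, dphi_exact)
```

The three printed numbers agree within the discretization error of the midpoint rule, which is the numerical counterpart of the statement that the line integral of a gradient depends only on the end points.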

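Consider now a particle with charge $q_a$, moving on a trajectory $\mathbf{x}_a(t)$
under the action of an electromagnetic field. We take the particle as
non-relativistic, so we can use the Lorentz force equation in the form
(3.5). In eq. (4.31), for the path $C$ we use the actual trajectory $\mathbf{x}_a(t)$
of the particle, and we denote its velocity by $\mathbf{v}_a(t) = d\mathbf{x}_a/dt$. Since
$\mathbf{v}_a\cdot(\mathbf{v}_a\times\mathbf{B}) = 0$, eq. (3.5) implies that $\mathbf{v}_a\cdot\mathbf{F}(\mathbf{x}_a) = q_a\mathbf{v}_a\cdot\mathbf{E}(\mathbf{x}_a)$ or,
equivalently, $d\mathbf{x}_a\cdot\mathbf{F}(\mathbf{x}_a) = q_a\,d\mathbf{x}_a\cdot\mathbf{E}(\mathbf{x}_a)$. Then, eq. (4.31) can be written as

$$q_a\left[\phi(\mathbf{x}) - \phi(\mathbf{x}_0)\right] = -\int_{\mathbf{x}_0}^{\mathbf{x}} d\mathbf{x}'\cdot\mathbf{F}(\mathbf{x}')\,. \tag{4.32}$$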

On the right-hand side, we recognize minus the work done by the
Lorentz force on the particle. The fact that, as we have seen, the integral
is independent of the path $C$ connecting $\mathbf{x}_0$ and $\mathbf{x}$ means, in the language
of mechanics, that the force $\mathbf{F}$, given in our case by the Lorentz force, is
conservative.

The work done by an external agent against the electric field to move
a particle from $\mathbf{x}_0$ to $\mathbf{x}$ is equal to minus the work done by the Lorentz
force,⁸ so

$$W_{\rm ext} = -\int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{F}(\mathbf{x}')\cdot d\mathbf{x}' = q_a\left[\phi(\mathbf{x}) - \phi(\mathbf{x}_0)\right]\,. \tag{4.33}$$


4.1.4 Instability of a system of static charges 


We now prove Earnshaw’s theorem, which states that, in a finite region
$R$, free of charges, the electrostatic potential $\phi(\mathbf{x})$ takes its maximum
and its minimum on the boundary $\partial R$. This theorem can then be used
to show that (in classical electrodynamics) a system of static charges
interacting only electromagnetically cannot be in a state of mechanical
equilibrium.

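The proof of the theorem is as follows. Suppose, to the contrary, that
$\phi(\mathbf{x})$ has a minimum at a point $\mathbf{x}$ in the interior of $R$. We can then
construct an infinitesimal volume $V$, which encloses $\mathbf{x}$ and is still inside
$R$, and is therefore charge-free. Let $\partial V$ be the boundary of $V$. If $\phi(\mathbf{x})$
has a minimum at $\mathbf{x}$, then $\phi$ increases as we move away from $\mathbf{x}$ in any
direction, so that, at an infinitesimal distance from $\mathbf{x}$, the gradient $\nabla\phi$
points away from $\mathbf{x}$. Then, on any point of $\partial V$ we should have
$\hat{\mathbf{n}}\cdot\nabla\phi > 0$, where $\hat{\mathbf{n}}$ is the outward unit normal to $\partial V$ at that point,
and therefore we should have

$$\int_{\partial V} d^2s\,\hat{\mathbf{n}}\cdot\nabla\phi > 0\,. \tag{4.34}$$
However, this is not possible since
$$\begin{aligned}
\int_{\partial V} d^2s\,\hat{\mathbf{n}}\cdot\nabla\phi &= -\int_{\partial V} d^2s\,\hat{\mathbf{n}}\cdot\mathbf{E} \\
&= -\int_V d^3x\,\nabla\cdot\mathbf{E}\,,
\end{aligned} \tag{4.35}$$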


and this vanishes, since $\nabla\cdot\mathbf{E} = 0$ in a charge-free region. Similarly, one
shows that there can be no maximum. Actually, even if we have stated
the theorem in the language of electrodynamics, using the electric field,
the theorem states, more generally, that any harmonic function, i.e.,
any function $\phi$ that satisfies $\nabla^2\phi = 0$ in a region $R$, has its minima and
maxima on the boundary $\partial R$. In fact, from Gauss’s theorem

$$\int_{\partial V} d^2s\,\hat{\mathbf{n}}\cdot\nabla\phi = \int_V d^3x\,\nabla\cdot(\nabla\phi) = \int_V d^3x\,\nabla^2\phi\,, \tag{4.36}$$

and this vanishes if $\nabla^2\phi = 0$ in $V$.

An important consequence of Earnshaw’s theorem is that a set of isolated
charges cannot be in a state of stable equilibrium under the action
of electrostatic forces only. Indeed, consider the electrostatic potential
$\phi(\mathbf{x})$ generated by a given distribution of charges, localized in a volume
$V$, and imagine placing a test charge $q_a$ at some position $\mathbf{x}_a$ inside $V$,
where there was no other charge. A stable equilibrium situation is then
only obtained if $\phi(\mathbf{x})$ has a minimum (or a maximum, depending on the
sign of $q_a$) at $\mathbf{x} = \mathbf{x}_a$. However, as we have seen, this is not possible.
Therefore, any point-like charge inserted into a pre-existing electrostatic
potential cannot be in stable equilibrium.⁹ Another important application of
Earnshaw’s theorem will be discussed in Section 4.1.6, where we will see
that it implies that the electric field inside a hollow conductor vanishes.
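A simple way to see the instability at work is to look at the matrix of second derivatives of $\phi$ at a point where the electric field vanishes. Since $\nabla^2\phi = 0$ there, the eigenvalues of this matrix sum to zero, so they cannot all have the same sign. The sketch below is an illustration only: it assumes two equal charges on the $z$ axis, units where $q/(4\pi\epsilon_0) = 1$, and finite-difference derivatives; the names `SOURCES`, `gradient`, and `hessian` are hypothetical.

```python
import numpy as np

D = 1.0                                   # half-separation of the two fixed charges
SOURCES = np.array([[0.0, 0.0, D], [0.0, 0.0, -D]])

def phi(x):
    """Potential of two equal charges, in units where q/(4 pi eps0) = 1."""
    return np.sum(1.0 / np.linalg.norm(x - SOURCES, axis=1))

def gradient(f, x, h=1e-3):
    """Central finite-difference gradient."""
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def hessian(f, x, h=1e-3):
    """Central finite-difference matrix of second derivatives."""
    H = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            ei = np.zeros(3)
            ej = np.zeros(3)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

midpoint = np.zeros(3)                    # E = 0 here by symmetry
grad = gradient(phi, midpoint)
H = hessian(phi, midpoint)
eigenvalues = np.linalg.eigvalsh(H)

print("grad phi    =", grad)
print("eigenvalues =", eigenvalues)
print("trace       =", np.trace(H))
```

At the midpoint the gradient vanishes, so a test charge placed there feels no force, but the eigenvalues have mixed signs: the potential has a saddle point rather than a minimum or a maximum, and the equilibrium is unstable for either sign of the test charge.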
9 This shows that a classical model of
matter, based only on point-like elec- 
trons and nuclei interacting with static 
electromagnetic interactions, cannot be 
stable. As we will see in Problem 10.2, 
the same is true also beyond the static 
limit, and a model of an atom made of a 
classical electron rotating around a pos- 
itively charged nucleus decays on a very 
short time scale because of the emission 
of electromagnetic radiation. This is an 
intrinsic limitation of classical electro- 
dynamics, and hints to the fact that, at 
some microscopic scale, the classical de- 
scription must be replaced by quantum 
mechanics. 
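10 This follows trivially, expanding
$\nabla\cdot(\psi\nabla\psi)$ on the right-hand side. More
generally, given two functions $\phi$ and $\psi$,
we have the identity

$$\int_V d^3x\,\left(\nabla\phi\cdot\nabla\psi + \phi\nabla^2\psi\right) = \int_V d^3x\,\nabla\cdot(\phi\nabla\psi)$$

and therefore, upon use of Gauss’s theorem,

$$\int_V d^3x\,\left(\nabla\phi\cdot\nabla\psi + \phi\nabla^2\psi\right) = \int_{\partial V} d\mathbf{s}\cdot(\phi\nabla\psi)\,, \tag{4.38}$$

which is called Green’s first identity.
Rewriting this exchanging $\phi$ with $\psi$ we have

$$\int_V d^3x\,\left(\nabla\psi\cdot\nabla\phi + \psi\nabla^2\phi\right) = \int_{\partial V} d\mathbf{s}\cdot(\psi\nabla\phi)$$

and, subtracting this from eq. (4.38) we get

$$\int_V d^3x\,\left(\phi\nabla^2\psi - \psi\nabla^2\phi\right) = \int_{\partial V} d\mathbf{s}\cdot\left(\phi\nabla\psi - \psi\nabla\phi\right)\,, \tag{4.39}$$

which is called Green’s second identity.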


11 We will formalize this more precisely
in Chapter 6, where we will see that, for
a distribution of charges, the Coulomb
potential $\phi \propto 1/r$ is the first term in a
multipole expansion. If the total charge
of the distribution vanishes, the $1/r$
term is absent and $\phi(\mathbf{x})$ goes to zero
faster than $1/r$.


4.1.5 Uniqueness of the solution of electrostatic 
problems 


We now ask to what extent the solution of a problem of electrostatics,
determined by eq. (4.3) together with the geometry of the system and
the boundary conditions, is unique. Let us assume that there are two
solutions $\phi_1(\mathbf{x})$ and $\phi_2(\mathbf{x})$ of eq. (4.3). Then, the difference
$\psi(\mathbf{x}) = \phi_1(\mathbf{x}) - \phi_2(\mathbf{x})$ satisfies

$$\nabla^2\psi = 0\,. \tag{4.37}$$
Let $V$ be a volume with boundary $\partial V$. We use the identity¹⁰
$$\int_V d^3x\,\nabla\psi\cdot\nabla\psi = \int_V d^3x\,\left[\nabla\cdot(\psi\nabla\psi) - \psi\nabla^2\psi\right]\,. \tag{4.40}$$

Then, using eq. (4.37) for the second term on the right-hand side, and
Gauss’s theorem for the first, we have
$$\int_V d^3x\,|\nabla\psi|^2 = \int_{\partial V} d\mathbf{s}\cdot(\psi\nabla\psi)\,. \tag{4.41}$$
V ov 
Consider first the situation in which the source term in eq. (4.3) is
localized, and the space is just a large volume in three-dimensional space,
with no inner boundaries, and a boundary $\partial V$ at large distances from
the sources. If the distribution of charges is localized, we can take as
volume $V$ a sphere of radius $R$ enclosing all charges, so

$$\int_{\partial V} d\mathbf{s}\cdot\psi\nabla\psi = R^2\int d\Omega\,\hat{\mathbf{r}}\cdot\psi\nabla\psi\,, \tag{4.42}$$

where we used the fact that, for a sphere, $d\mathbf{s} = R^2 d\Omega\,\hat{\mathbf{r}}$. At sufficiently
large distances any solution $\phi(\mathbf{x})$ of eq. (4.3) decreases with distance
at least as $1/r$, so $\nabla\phi$ decreases at least as $1/r^2$.¹¹ Therefore, on the
surface of the sphere, $\psi\nabla\psi$ is of order $1/R^3$ or smaller. Then, taking the
limit $R \to \infty$, the right-hand side of eq. (4.41) vanishes, and eq. (4.41)
gives

$$\int d^3x\,|\nabla\psi|^2 = 0\,, \tag{4.43}$$

where the integral is now over all three-dimensional space. Since $|\nabla\psi|^2$ is
a non-negative quantity, this can only be satisfied if $\nabla\psi = 0$ everywhere,
so $\psi$ must be a constant. This shows that, if $\phi_1(\mathbf{x})$ and $\phi_2(\mathbf{x})$ are two
distinct solutions of eq. (4.3), they can differ at most by a constant. A
constant addition to a potential is irrelevant, since it does not affect the
electric field, and can be fixed simply by imposing that the solution of
eq. (4.3) vanishes at infinity. Therefore, the solution of eq. (4.3) for a
localized distribution of charges, in a space with no inner boundaries, is
unique.

This argument can be easily generalized to the situation in which we
consider a space that, rather than being the whole $\mathbb{R}^3$, has one or several
inner boundaries $S_i$, that could correspond, for instance, to surfaces of
material bodies (we will study the case where these bodies are perfect
conductors in Section 4.1.6). In this case, eq. (4.41) becomes


$$\int_V d^3x\,|\nabla\psi|^2 = \sum_i \int_{S_i} d\mathbf{s}\cdot\psi\nabla\psi\,. \tag{4.44}$$


To have a well-defined problem we must assign boundary conditions for
$\phi_1(\mathbf{x})$ and $\phi_2(\mathbf{x})$ on the inner boundaries $S_i$, which will induce
corresponding boundary conditions on $\psi$. There are two natural boundary
conditions on a surface:


• Dirichlet boundary conditions: in this case we fix the value of the
potential $\phi$ on the surfaces, so we set $\phi_1(\mathbf{x}) = \phi_2(\mathbf{x}) = f_i(\mathbf{x})$ on the
surface $S_i$. This implies $\psi(\mathbf{x}) = 0$ on each surface $S_i$, so the
right-hand side of eq. (4.44) vanishes. Then, we find again eq. (4.43),
which implies that $\nabla\psi = 0$ and therefore $\psi$ is a constant.

• Neumann boundary conditions: in this case we require that, for
any solution $\phi$ of eq. (4.3), the component of $\nabla\phi$ normal to the
surface vanishes, so $\hat{\mathbf{n}}\cdot\nabla\phi_{1,2} = 0$ and therefore also $\hat{\mathbf{n}}\cdot\nabla\psi = 0$.
Since $d\mathbf{s} = ds\,\hat{\mathbf{n}}$, we find again that the right-hand side of eq. (4.44)
vanishes and that $\psi$ is a constant.


Then, also in these cases, the solution for $\phi$ is unique, apart from an
irrelevant constant.
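The uniqueness result for Dirichlet boundary conditions has a direct numerical counterpart. The sketch below is a two-dimensional, discretized analogue (an assumption of this example, not the three-dimensional problem of the text): it solves the finite-difference Laplace equation on a square grid by Jacobi relaxation, starting from two very different initial guesses with the same boundary values. The boundary profiles and the function name `relax` are hypothetical.

```python
import numpy as np

def relax(boundary, interior_guess, n_iter=4000):
    """Jacobi iteration for the 5-point discrete Laplace equation on a square grid."""
    u = boundary.copy()
    u[1:-1, 1:-1] = interior_guess
    for _ in range(n_iter):
        # each interior value is replaced by the average of its four neighbours;
        # the right-hand side is evaluated from the old values before assignment
        u[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                                + u[1:-1, 2:] + u[1:-1, :-2])
    return u

n = 21
s = np.linspace(0.0, 1.0, n)
boundary = np.zeros((n, n))
boundary[0, :] = np.sin(np.pi * s)     # prescribed potential on one side
boundary[-1, :] = s * (1.0 - s)        # a different profile on the opposite side
# the two remaining sides are held at zero potential

rng = np.random.default_rng(1)
u1 = relax(boundary, rng.normal(size=(n - 2, n - 2)))
u2 = relax(boundary, 5.0 + rng.normal(size=(n - 2, n - 2)))
psi = u1 - u2                          # discrete analogue of psi = phi_1 - phi_2

print("max |psi| =", np.max(np.abs(psi)))
print("interior range:", u1[1:-1, 1:-1].min(), u1[1:-1, 1:-1].max())
```

Both runs converge to the same grid function, so `psi` vanishes up to rounding errors. The interior values also stay between the smallest and largest boundary values, in line with the statement of Section 4.1.4 that a harmonic function has its extrema on the boundary.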

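A related result is that (on a topologically trivial space, such as $\mathbb{R}^3$) a
vector field $\mathbf{V}(\mathbf{x})$ is uniquely determined by its divergence and its curl,
modulo the gradient of a function $\psi$ that satisfies $\nabla^2\psi = 0$. In fact,
consider the equations

$$\nabla\cdot\mathbf{V} = f(\mathbf{x})\,, \qquad \nabla\times\mathbf{V} = \mathbf{u}(\mathbf{x})\,, \tag{4.45}$$

with $f(\mathbf{x})$ and $\mathbf{u}(\mathbf{x})$ given. Let $\mathbf{V}_1(\mathbf{x})$ and $\mathbf{V}_2(\mathbf{x})$ be two solutions of
these equations. Then the vector field $\mathbf{w}(\mathbf{x}) = \mathbf{V}_2(\mathbf{x}) - \mathbf{V}_1(\mathbf{x})$ satisfies

$$\nabla\cdot\mathbf{w} = 0\,, \qquad \nabla\times\mathbf{w} = 0\,. \tag{4.46}$$

From the theorem for curl-free fields given on page 7, $\nabla\times\mathbf{w} = 0$ implies
that (on $\mathbb{R}^3$) we can write $\mathbf{w} = \nabla\psi$. Then, $\nabla\cdot\mathbf{w} = 0$ becomes $\nabla^2\psi = 0$.
Therefore,

$$\mathbf{V}_2(\mathbf{x}) = \mathbf{V}_1(\mathbf{x}) + \nabla\psi\,, \qquad \text{with}\quad \nabla^2\psi = 0\,. \tag{4.47}$$

If, furthermore, the boundary conditions of the problem are such that
$\psi$ goes to zero at infinity, then the only solution of $\nabla^2\psi = 0$ is $\psi = 0$,
and $\mathbf{V}_2(\mathbf{x}) = \mathbf{V}_1(\mathbf{x})$.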

In electrostatics, we have $\nabla\cdot\mathbf{E} = \rho(\mathbf{x})/\epsilon_0$ and $\nabla\times\mathbf{E} = 0$, so, in $\mathbb{R}^3$,
$\rho(\mathbf{x})$ uniquely determines the solution for $\mathbf{E}(\mathbf{x})$, modulo the gradient of
a function that satisfies $\nabla^2\psi = 0$. For a localized distribution of charge,
the argument in eqs. (4.37)-(4.43) then shows that $\nabla\psi = 0$, so we find
again that $\mathbf{E}$ is uniquely determined by $\rho(\mathbf{x})$. In this form, however, the
theorem extends also to magnetostatics (with a localized distribution of
currents), which, as we will discuss in more detail later, is governed by
the equations $\nabla\cdot\mathbf{B} = 0$ and $\nabla\times\mathbf{B} = \mu_0\,\mathbf{j}$, so both the divergence and
the curl of $\mathbf{B}$ are given.
12 In Solved Problem 11.1, we will
show explicitly how to compute V in 
terms of its curl and its divergence, 
see eq. (11.174). In eqs. (13.52) and 
(13.56), we will then use these results 
to give the corresponding solutions for 
E and B in material media, in full gen- 
erality. 
Fig. 4.1 A conductor in an external
electric field $\mathbf{E}_{\rm ext}$. Under its action,
the free electrons move to the surface
and create an induced electric
field $\mathbf{E}_{\rm ind}$, that screens $\mathbf{E}_{\rm ext}$.


13 Note that this only holds for a static
situation in which a conductor, subject 
only to a static external electric field, 
has rearranged its surface charges and 
reached an equilibrium situation, where 
the external field is screened. If, in- 
stead, we use a battery to keep a po- 
tential difference between two points of 
a conducting wire (which amounts to 
continually removing the surface charge 
imbalance), a steady current will flow. 


Fig. 4.2 An infinitesimal cylin- 
der across the boundary between 
a conductor (shaded part, marked 
as medium 1) and the vacuum 
(medium 2). 


4.1.6 Electrostatics of conductors 


Consider now a conductor, with volume $V$ and boundary $S = \partial V$. A
fundamental property of conductors is that, in a static situation, the
electric field inside them must vanish. Indeed, suppose that we apply an 
external time-independent electric field. In a conductor, the electrons 
are free to move, and, because of their negative charge, they will move 
in the direction opposite to the applied electric field. They will then 
eventually accumulate on some parts of the surface, which will therefore 
be negatively charged, and deplete other parts of the surface. The latter 
will therefore be overall positively charged because, there, the positive 
charge of the ions (which stay fixed) is no longer fully compensated by 
the electrons, see Fig. 4.1. This charge imbalance creates an induced 
electric field in the direction opposite to the applied electric field. The 
process will continue until the external field is completely screened and 
the total electric field vanishes (we are assuming here an ideal conductor, 
with an infinite supply of free charges; in normal situations this is a 
very good approximation for a metal). From Gauss’s law $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$,
$\mathbf{E} = 0$ implies $\rho = 0$. Therefore, at equilibrium, in the interior of
the conductor positive and negative charges balance perfectly, and any
charge imbalance is on the surface of the conductor.¹³

Another consequence of this screening process is that the surface of
the conductor is an equipotential surface, i.e., the potential $\phi$ has a
constant value on the surface. This follows from eq. (4.31), taking $\mathbf{x}_0$
and $\mathbf{x}$ to be two points on the surface of the conductor, and $C$ any path
that connects them passing through the interior of the conductor, where
$\mathbf{E} = 0$. The fact that $\phi$ is constant on the surface also means that, on
the surface, the components of $\nabla\phi$ parallel to the surface vanish. The
electric field at the surface of the conductor is therefore perpendicular to
the surface. Again, physically, what happens is that, as long as there is
a component of $\mathbf{E}$ tangential to the surface, the surface charges move in
such a way as to screen it, until an equilibrium configuration with zero
tangential electric field is reached. The component of $\mathbf{E}$ perpendicular
to the surface of the conductor can be computed using the integrated
Gauss’s law, that we rewrite here as

$$\int_V d^3x\,\nabla\cdot\mathbf{E} = \frac{1}{\epsilon_0}\int_V d^3x\,\rho\,. \tag{4.48}$$


We take as volume $V$ the cylinder shown in Fig. 4.2, which straddles
the boundary between the conductor and the vacuum. We take
the $z$ axis along the height of the cylinder, so

$$\int_V d^3x\,\rho = \int_A dx\,dy \int_{-h/2}^{+h/2} dz\,\rho\,. \tag{4.49}$$

In the limit $h \to 0$,

$$\int_{-h/2}^{+h/2} dz\,\rho = \sigma \tag{4.50}$$

is the surface charge density. We take $A$ sufficiently small, so that $\sigma$ can
be taken to be spatially constant over $A$. Then,

$$\int_A dx\,dy \int_{-h/2}^{+h/2} dz\,\rho = A\sigma\,. \tag{4.51}$$


On the left-hand side of eq. (4.48), in the limit $h \to 0$ we have

$$\begin{aligned}
\int_V d^3x\,\nabla\cdot\mathbf{E} &= \int_A dx\,dy \int_{-h/2}^{+h/2} dz\,\partial_z E_z
+ \int_{-h/2}^{+h/2} dz \int_A dx\,dy\,\left(\partial_x E_x + \partial_y E_y\right) \\
&= A\left[E_z(2) - E_z(1)\right] + O(h)\,,
\end{aligned} \tag{4.52}$$

where $E_z(2) \equiv E_z$ is the value of $E_z$ as we approach the conductor from
the medium 2, here taken to be just the vacuum, and $E_z(1)$ is the value
when we approach the boundary from the medium 1. In the case where
the latter is a conductor, $E_z(1) = 0$. Therefore, sending $h \to 0$ with
$A$ sufficiently small so that $\sigma$ is constant over it, but still its linear size
much larger than $h$, we find that, on the surface,

$$E_z = \frac{\sigma}{\epsilon_0}\,. \tag{4.53}$$
For a boundary with a normal $\hat{\mathbf{n}}$ (pointing outward from the conductor)
in a generic direction, rather than along $z$, we then have

$$\hat{\mathbf{n}}\cdot\mathbf{E} = \frac{\sigma}{\epsilon_0}\,. \tag{4.54}$$
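The jump of the normal field across a charged surface can be illustrated numerically. The sketch below assumes, as a standard example not worked out in the text, an isolated charged conducting sphere, whose surface charge is uniform; the surface is replaced by many equal point charges on a Fibonacci lattice, and the field is evaluated slightly outside and slightly inside. The radius, the value of `sigma`, and the function names are hypothetical.

```python
import numpy as np

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m

def fibonacci_sphere(n, radius):
    """n nearly uniformly spaced points on a sphere (Fibonacci lattice)."""
    k = np.arange(n) + 0.5
    z = 1.0 - 2.0 * k / n
    azimuth = np.pi * (1.0 + np.sqrt(5.0)) * k
    s = np.sqrt(1.0 - z**2)
    return radius * np.stack([s * np.cos(azimuth), s * np.sin(azimuth), z], axis=1)

def field(x, src, dq):
    """Coulomb field at x of point charges dq located at src."""
    sep = x - src
    dist = np.linalg.norm(sep, axis=1)
    return np.sum((dq / dist**3)[:, None] * sep, axis=0) / (4.0 * np.pi * EPS0)

R = 0.05         # sphere radius in m (hypothetical)
sigma = 2e-6     # uniform surface charge density in C/m^2 (hypothetical)
n = 100_000
src = fibonacci_sphere(n, R)
dq = np.full(n, sigma * 4.0 * np.pi * R**2 / n)   # equal share of the total charge

n_hat = np.array([1.0, 2.0, 2.0]) / 3.0           # outward unit normal at the chosen point
E_out = field(1.1 * R * n_hat, src, dq) @ n_hat   # normal field just outside
E_in = field(0.9 * R * n_hat, src, dq) @ n_hat    # normal field just inside

print("E_out / (sigma/eps0) =", E_out / (sigma / EPS0))
print("E_in  / (sigma/eps0) =", E_in / (sigma / EPS0))
```

Inside, the field is compatible with zero, as expected for a conductor. Outside, at $r = 1.1R$, the normal field equals $(\sigma/\epsilon_0)(R/r)^2$, which tends to the value $\sigma/\epsilon_0$ of eq. (4.54) as the point approaches the surface. The evaluation points are kept several lattice spacings away from the surface, since closer in the graininess of the point charges would spoil the comparison.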


We can now use the fact that on the surface of conductors $\phi$ is constant,
combined with the results of Section 4.1.5, to investigate the uniqueness
of the solution of electrostatic problems in the presence of conductors.
Let $\phi_1$ and $\phi_2$ be two solutions of $\nabla^2\phi = -\rho/\epsilon_0$, and define $\psi = \phi_1 - \phi_2$.
According to the discussion in Section 4.1.5, we want to understand
under what conditions the integral on the right-hand side of eq. (4.44)
vanishes. We write $\mathbf{E}_1 = -\nabla\phi_1$ and $\mathbf{E}_2 = -\nabla\phi_2$, and we consider
for simplicity the case of a single conductor with surface $S$, so that the
volume $V$ that includes the conductor has $S$ as its inner boundary. Then

$$\begin{aligned}
\int_S d\mathbf{s}\cdot\psi\nabla\psi &= -(\phi_1-\phi_2)|_S \int_S d\mathbf{s}\cdot(\mathbf{E}_1 - \mathbf{E}_2) \\
&= -(\phi_1-\phi_2)|_S \int d^3x\,\nabla\cdot(\mathbf{E}_1 - \mathbf{E}_2) \\
&= -(\phi_1-\phi_2)|_S\,\frac{1}{\epsilon_0}\int d^3x\,(\rho_1 - \rho_2) \\
&= -(\phi_1-\phi_2)|_S\,\frac{Q_1 - Q_2}{\epsilon_0}\,,
\end{aligned} \tag{4.55}$$

where, in the first line, we used the fact that $\phi_1$ and $\phi_2$ are constants on $S$
and can therefore be carried out of the integral, and we then used Gauss’s
theorem (1.48) and Gauss’s law (3.1). From this, we see that there are
two ways of making the left-hand side of eq. (4.55) vanish: (1) we can fix
the value of the potential $\phi$ on the surface, so that $(\phi_1-\phi_2)|_S = 0$. This is
a Dirichlet boundary condition; (2) we can instead fix the total charge on the
conductor, so that $Q_1 = Q_2$. As we see from the previous steps, this is an
integrated form of a Neumann boundary condition, since it corresponds
to fixing the surface integral of the component of $\nabla\phi$ normal to the
surface (rather than assigning its value at each point of the surface). In
both cases, the left-hand side of eq. (4.55) vanishes and, by the argument
in Section 4.1.5, the solution for $\phi$ is then unique.

Finally, consider a hollow conductor, i.e., a conductor which, in its
interior, has a cavity region $R$, where no charges are present. In this
region, the potential satisfies $\nabla^2\phi = 0$. Furthermore, at the boundary
$\partial R$ between the cavity and the conductor, we have $\phi = \phi_0$, with $\phi_0$
a constant, since we have seen that the surface of a conductor must
be an equipotential surface (and the argument holds independently of
whether this is an external or an internal boundary surface). However, as
we discussed in Section 4.1.4, Earnshaw’s theorem states that $\phi$ cannot
have minima or maxima inside a charge-free region $R$. A function that
is constant on the boundary $\partial R$ and has no maxima or minima inside
is necessarily constant everywhere in $R$. Therefore, in all $R$, $\phi = \phi_0$,
and $\mathbf{E} = -\nabla\phi = 0$. This is a remarkable result, that shows that, no
matter the geometry of the inner cavity, the surface charges distribute
themselves on the boundary in such a way that they screen any electric
field in its interior. This is the principle at the basis of Faraday’s cage.

Note, however, that the argument no longer goes through if there are
charges inside the cavity, since in this case $\nabla^2\phi \neq 0$ inside. Therefore,
the surface charges on the inner and outer surfaces of a hollow conductor
distribute themselves so as to cancel the electric field generated
by charges outside the conductor, but do not cancel the electric field
generated by charges in the internal cavity.
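The screening inside an empty cavity can be illustrated with a small relaxation computation. The sketch below is a two-dimensional, discretized analogue (an assumption of this example): the grid cells marked `cavity` satisfy the finite-difference Laplace equation, all other cells belong to the conductor and are held at the constant potential `phi0`. The cavity shape and the value of `phi0` are hypothetical.

```python
import numpy as np

n = 31
phi0 = 2.5                                  # potential of the conductor (hypothetical value)
cavity = np.zeros((n, n), dtype=bool)
cavity[1:-1, 1:-1] = True
cavity[10:21, 15:-1] = False                # conducting protrusion: the cavity is C-shaped

rng = np.random.default_rng(2)
u = np.where(cavity, rng.normal(size=(n, n)), phi0)   # arbitrary initial guess in the cavity

for _ in range(8000):
    avg = np.full_like(u, phi0)
    # Jacobi step: average of the four neighbours at every interior grid cell
    avg[1:-1, 1:-1] = 0.25 * (u[2:, 1:-1] + u[:-2, 1:-1]
                              + u[1:-1, 2:] + u[1:-1, :-2])
    u = np.where(cavity, avg, phi0)         # conductor cells stay at phi0

grad_rows, grad_cols = np.gradient(u)       # finite-difference gradient of the potential
E_max = np.max(np.hypot(grad_rows, grad_cols))

print("max |phi - phi0| in the cavity =", np.max(np.abs(u - phi0)))
print("max |E| on the grid            =", E_max)
```

Whatever the initial guess and the shape of the cavity, the potential relaxes to the constant `phi0` and the field inside vanishes. Placing a source term inside the cavity would change this, consistently with the remark that the argument fails when charges are present in the cavity.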


4.1.7 Electrostatic forces from surface integrals 


We now elaborate on eq. (3.69) to show that, in electrostatics, the
equations of motion of charged particles (or of extended bodies) can be
written in terms of surface integrals. We consider a set of extended bodies (or
point charges) and, for the integration volume that enters in eq. (3.68),
we choose a volume $V_a$ that includes only the $a$-th body, so that $\mathbf{P}_{\rm mech}$
is the momentum $\mathbf{p}_a$ of the $a$-th extended body. In electrostatics $\mathbf{B} = 0$,
and therefore $\mathbf{P}_{\rm em}$ vanishes, see eq. (3.67). Then, eq. (3.69) becomes

$$\left(\frac{d\mathbf{p}_a}{dt}\right)_i = -\int_{\partial V_a} d^2s\,\hat{n}_j\,T_{ij}\,. \tag{4.56}$$

Note that the derivation of eq. (3.69) was completely general and did
not assume that the particles are non-relativistic. When they are
non-relativistic, we can also use Newton’s law $d\mathbf{p}_a/dt = \mathbf{F}_a$, where $\mathbf{F}_a$ is the
force acting on the $a$-th body, and eq. (4.56) can be rewritten as

$$(\mathbf{F}_a)_i = -\int_{\partial V_a} d^2s\,\hat{n}_j\,T_{ij}\,. \tag{4.57}$$


It is remarkable that, in electrostatics, the total electric force acting
on a body can be computed as an integral over any surface enclosing it
(so, in particular, over its boundary surface), without apparently knowing
the forces on the individual volume elements inside the body.¹⁴ In
electrostatics, the Maxwell stress tensor (3.63) reduces to

\begin{align}
T_{ij} &= -\epsilon_0\left(E_i E_j - \frac{1}{2}\delta_{ij}E^2\right) \tag{4.58}\\
&= -\epsilon_0\left(\partial_i\phi\,\partial_j\phi - \frac{1}{2}\delta_{ij}\,\partial_k\phi\,\partial_k\phi\right)\,, \tag{4.59}
\end{align}

where we used eq. (4.30). Using eq. (4.58), we can rewrite eq. (4.56) as

$$\frac{d\mathbf{p}_a}{dt} = \epsilon_0\int_{\partial V_a} d^2s\,\left[(\mathbf{E}\cdot\hat{\mathbf{n}})\,\mathbf{E} - \frac{1}{2}E^2\,\hat{\mathbf{n}}\right]\,. \tag{4.60}$$

This expresses the force on a charge $q_a$ as an integral of the total electric
field (including that generated by the charge $q_a$ itself) over a surface
enclosing the charge. This is, apparently, quite different from the standard
expression $\mathbf{F} = q_a\mathbf{E}'(\mathbf{x}_a)$, where we denote here by $\mathbf{E}'(\mathbf{x}_a)$ the field
generated by all other charges, except $q_a$, evaluated at the position $\mathbf{x}_a$
of the charge $q_a$.

As an application, it is instructive to see how eq. (4.60) reproduces the
Coulomb force between two point charges. We consider for simplicity
two equal charges $q_1 = q$ and $q_2 = q$ (we take $q > 0$) at a distance
$2d$, and we set their positions at $\mathbf{x}_1 = (0,0,+d)$ and $\mathbf{x}_2 = (0,0,-d)$,
respectively. To compute the force on the first charge we take, as the
volume $V_1$ that encloses it, a hemisphere of radius $R$ in the $z > 0$ region,
i.e., the volume defined by the conditions $x^2 + y^2 + z^2 < R^2$ and $z > 0$,
see Fig. 4.3, and we send $R \to \infty$. The boundary $\partial V_1$ is then given
by the union of the $(x,y)$ plane and the surface of the hemisphere at
infinity. The electric field to be used in eq. (4.60) is the total electric
field created by the two charges, since this is the quantity that enters in
the Maxwell stress tensor (4.58). As $R \to \infty$, $\mathbf{E}$ is of order $1/R^2$. So,
on the surface of the hemisphere, the term in brackets in eq. (4.60) is of
order $1/R^4$, while $d^2s = R^2 d\Omega$. Therefore, as $R \to \infty$, the contribution
from the surface of the hemisphere at infinity vanishes, and only the
contribution from the $(x,y)$ plane matters. This can be computed as
follows. Consider first the electric field $\mathbf{E}_1(x,y)$ produced by the charge
$q_1$ in a point $\mathbf{x} = (x,y,0)$ of the plane. The squared distance between
the charge $q_1$ located at $\mathbf{x}_1 = (0,0,+d)$ and the point $\mathbf{x} = (x,y,0)$ is
$x^2 + y^2 + d^2$, so the modulus of $\mathbf{E}_1(x,y)$ is given by

$$|\mathbf{E}_1(x,y)| = \frac{q}{4\pi\epsilon_0}\,\frac{1}{x^2 + y^2 + d^2}\,. \tag{4.61}$$
14 The underlying reason is that, in
electrostatics, the knowledge of the 
field on the boundary uniquely deter- 
mines the field everywhere, as we have 
seen in Section 4.1.5, so, in fact, the in- 
formation on the field inside the body 
is implicitly there. 


Fig. 4.3 The hemisphere surrounding
the charge $q_1$. The vector $\mathbf{v}$ indicates
the direction of the electric
field generated by the charge $q_1$ at
the point with coordinates $(x,y,0)$.
15 To compute the integral over $\rho$ we
introduce $u = \rho^2/d^2$ and we use

$$\int_0^\infty du\,\frac{u}{(u+1)^3} = \frac{1}{2}\,. \tag{4.65}$$


16 A note for the advanced reader. This
way of writing the equations of motion
in terms of surface integrals can
also be performed in Newtonian gravity,
with the gravitational potential
taking the role of the scalar potential
$\phi$ in eq. (4.59) (and $\epsilon_0$ replaced
by $1/(4\pi G)$, where $G$ is Newton’s constant),
see eqs. (5.221)-(5.224) of Maggiore
(2007). This admits an elegant
extension to General Relativity, devel- 
oped in a classic work by Einstein, In- 
feld, and Hoffmann in 1938. In Gen- 
eral Relativity, this can provide signifi- 
cant advantages because the dynamics 
can then be written in a way that only 
involves fields at large distances from 
the source, i.e., the weak-field regime, 
independently of the internal structure 
of the sources, where, in General Rela- 
tivity, in particular for black holes and 
neutron stars, complicated non-linear 
effects might take place. 


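For positive $q$, $\mathbf{E}_1(x,y)$ points in the direction of the vector $\mathbf{v}$ given by
$\mathbf{x}_1 + \mathbf{v} = \mathbf{x}$, see again Fig. 4.3, so $\mathbf{v} = \mathbf{x} - \mathbf{x}_1 = (x,y,-d)$. The unit
vector in this direction is therefore
$$\hat{\mathbf{v}} = \frac{1}{(x^2 + y^2 + d^2)^{1/2}}\,(x,y,-d)\,, \tag{4.62}$$
and therefore
$$\mathbf{E}_1(x,y) = \frac{q}{4\pi\epsilon_0}\,\frac{1}{(x^2 + y^2 + d^2)^{3/2}}\,(x,y,-d)\,. \tag{4.63}$$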


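The field $\mathbf{E}_2(x,y)$ generated by the second charge at the point $\mathbf{x} =
(x,y,0)$ is simply obtained by replacing $d \to -d$ in the previous expression.
Therefore, the total electric field $\mathbf{E}(x,y) = \mathbf{E}_1(x,y) + \mathbf{E}_2(x,y)$ is
given by

$$\mathbf{E}(x,y) = \frac{2q}{4\pi\epsilon_0}\,\frac{1}{(x^2 + y^2 + d^2)^{3/2}}\,(x,y,0)\,. \tag{4.64}$$
Note that, on the $(x,y)$ plane, the electric field has no $z$ component, as
was clear from the symmetry of the problem, since we have taken $q_1 = q_2$.
We now introduce polar coordinates in the $(x,y)$ plane, $x = \rho\cos\varphi$,
$y = \rho\sin\varphi$. Then, in eq. (4.60), $d^2s = \rho\,d\rho\,d\varphi$ and, for the volume $V_1$,
the outer normal to the plane is $\hat{\mathbf{n}} = -\hat{\mathbf{z}}$. Therefore, in eq. (4.60), the
term $\mathbf{E}\cdot\hat{\mathbf{n}}$ vanishes and, for the force $\mathbf{F}_1 = d\mathbf{p}_1/dt$ exerted on the first
particle, we get¹⁵

$$\begin{aligned}
\mathbf{F}_1 &= \frac{\epsilon_0}{2}\left(\frac{2q}{4\pi\epsilon_0}\right)^2 \hat{\mathbf{z}} \int_0^\infty d\rho\,\rho \int_0^{2\pi} d\varphi\,\frac{\rho^2}{(\rho^2 + d^2)^3} \\
&= \frac{1}{4\pi\epsilon_0}\,\frac{q^2}{(2d)^2}\,\hat{\mathbf{z}}\,.
\end{aligned} \tag{4.66}$$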


This correctly gives the Coulomb force on the first charge, due to the 
second charge at distance 2d. In particular, a force in the +z direction 
on the first charge, which is located at (0,0, d), exerted by the charge at 
(0,0, —d), corresponds to a repulsive force along the line joining the two 
charges, which is the correct result given that we took charges with the 
same sign. 

It might appear that we have killed an ant with a hammer, given the 
long computation performed just for getting back the Coulomb force. 
However, conceptually it is quite interesting to see how the force on a 
charge can be obtained without using the electric field at the position 
of the charge itself, but rather using the total electric field on a surface 
surrounding it, which could even be chosen at a large distance from it. 
Note also that, in the usual computation of the force $\mathbf{F}_1$ exerted on a
charge $q_1$ by a charge $q_2$, we have $\mathbf{F}_1 = q_1\mathbf{E}_2(\mathbf{x}_1)$, where $\mathbf{E}_2(\mathbf{x}_1)$ is the
electric field generated at $\mathbf{x}_1$ by the charge $q_2$ (or, in general, by all other
charges present, except $q_1$ itself). In contrast, the computation of the
force $\mathbf{F}_1$ from the Maxwell stress tensor involves the total electric field $\mathbf{E}$
generated by all charges, including $q_1$.¹⁶
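The surface-integral computation above can be repeated numerically. The sketch below evaluates the integral of eq. (4.60) over the plane $z = 0$ for two equal charges, using the total field (4.64), and compares it with the Coulomb force. The values of `q` and `d`, the substitution $\rho = d\tan\theta$ used to map the infinite radial range onto a finite one, and the function name are choices made for this example.

```python
import numpy as np

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m

def stress_tensor_force(q, d, n=20000):
    """z component of eq. (4.60) on the plane z = 0, for equal charges q at z = +d and -d."""
    dtheta = 0.5 * np.pi / n
    theta = (np.arange(n) + 0.5) * dtheta      # midpoints, so theta never reaches pi/2
    rho = d * np.tan(theta)                    # radial coordinate on the plane
    drho = d / np.cos(theta) ** 2 * dtheta     # Jacobian of rho = d tan(theta)
    # modulus of the total field on the plane, eq. (4.64)
    E = 2.0 * q / (4.0 * np.pi * EPS0) * rho / (rho**2 + d**2) ** 1.5
    # with n = -z and E.n = 0, eq. (4.60) gives F_z = (eps0/2) * integral of E^2 d^2s,
    # and the angular integral contributes a factor 2 pi
    return 0.5 * EPS0 * np.sum(E**2 * 2.0 * np.pi * rho * drho)

q = 1e-9    # charge in C (hypothetical)
d = 0.01    # half-separation in m (hypothetical)
F_stress = stress_tensor_force(q, d)
F_coulomb = q**2 / (4.0 * np.pi * EPS0 * (2.0 * d) ** 2)

print("F from stress tensor =", F_stress)
print("F from Coulomb's law =", F_coulomb)
```

The two numbers agree to high accuracy, and the positive sign of the result corresponds to a force along $+\hat{\mathbf{z}}$ on the charge at $z = +d$, i.e., to a repulsion.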


4.2 Magnetostatics 


We now turn to magnetostatics, i.e., situations that involve only static 
magnetic fields. In this case the Ampére—Maxwell law (3.2) reduces to 
Ampére’s law (see Note 7 on page 44), 


V x B= uoj, (4.67) 
and this, together with eq. (3.3), that we repeat here, 
V-B=0, (4.68) 


determines the magnetic field. Observe that, in magnetostatics, only the
combination $\mu_0 = 1/(\epsilon_0 c^2)$ enters. Equation (4.67) implies

$$\nabla\cdot\mathbf j = 0, \qquad (4.69)$$
as we see by taking the divergence of both sides. This is consistent with
the conservation equation (3.22) if we set the net electric charge density
$\rho = 0$ or, more generally, if $\partial\rho/\partial t = 0$. The integrated form of the
Ampère-Maxwell law, eq. (3.19), reduces to


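$$\oint_C d\boldsymbol\ell\cdot\mathbf B(\mathbf x) = \mu_0 I, \qquad (4.70)$$

where $I$ is the current through any surface bounded by $C$, while the
integrated form of eq. (4.68) is still given by eq. (3.16), which we repeat
here,

$$\oint_S d\mathbf s\cdot\mathbf B(t,\mathbf x) = 0. \qquad (4.71)$$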


4.2.1 Magnetic field of an infinite straight wire 


As a first application, we compute the magnetic field generated by an
infinite straight wire carrying a steady current $I$. The problem of computing
the magnetic field produced by a generic static distribution of
currents will be studied, in full generality, in Section 4.2.2. We set the
wire along the $z$ direction, and we use cylindrical coordinates $(\rho, \varphi, z)$,
as shown in Fig. 4.4. Since the problem is invariant under translations
along the $z$ axis and rotations around the wire, $\mathbf B(\rho, \varphi, z)$ must be of the
form

$$\mathbf B(\rho,\varphi,z) = B_\rho(\rho)\,\hat{\boldsymbol\rho} + B_\varphi(\rho)\,\hat{\boldsymbol\varphi} + B_z(\rho)\,\hat{\mathbf z}, \qquad (4.72)$$

where $B_\rho$, $B_\varphi$, and $B_z$ are independent of $z$ and $\varphi$. Writing $\mathbf j = j_z(\rho)\hat{\mathbf z}$,
and using eqs. (1.30) and (1.33) for the divergence and curl in cylindrical
coordinates, as well as the fact that $\mathbf B$ is independent of $\varphi$ and $z$,
Ampère’s law (4.67) becomes


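$$-\hat{\boldsymbol\varphi}\,\partial_\rho B_z + \hat{\mathbf z}\,\frac{1}{\rho}\,\partial_\rho(\rho B_\varphi) = \mu_0\, j_z(\rho)\,\hat{\mathbf z}, \qquad (4.73)$$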
Fig. 4.4 The cylindrical coordinate system centered on the wire (with
the wire along the $z$ axis), and the loop $C$ used to apply the integrated
Ampère’s law.

17 Here, we consider the limit of an infinitely thin wire, so we can write the
current density as $j_z(\mathbf x) = I\,\delta(x)\delta(y)$, and the proportionality constant is the
current $I$, fixed by $I = \int dx\,dy\, j_z$. However, later, it will be useful to model
it using a function $j_z(\rho)$ that vanishes for $\rho > a$, with $a$ the transverse size of
the wire, that we will eventually send to zero.

Fig. 4.5 The cylinder used in the integrated Maxwell equation (4.71).

18 In fact, it is not even necessary to use this information; even if $B_z$ were
non-zero, invariance under translation along the $z$ direction would imply that
$B_z$ is the same at $z = \pm h/2$, so the flux entering from the lower face would
cancel against that coming out from the upper face.

Fig. 4.6 The geometry of the symmetry argument based on parity
and subsequent 180° rotations, discussed in the text.



while eq. (4.68) becomes
$$\frac{1}{\rho}\,\partial_\rho(\rho B_\rho) = 0. \qquad (4.74)$$

From eq. (4.73), we can immediately conclude that $\partial_\rho B_z = 0$ and therefore
$B_z$ is a constant. As a boundary condition, for physical reasons
we require that, at infinite distance from the wire, i.e., at $\rho\to\infty$, $B_z$
vanishes. Then, this constant is actually zero, so $B_z = 0$ everywhere.

Next, we can prove that the radial component $B_\rho$ also vanishes. This
can be seen more easily from the integrated Maxwell equation (4.71),
taking as surface $S$ the boundary of a cylinder of height $h$ and radius
$\rho > 0$, whose axis coincides with the wire, and with faces at $z = \pm h/2$,
see Fig. 4.5. Equation (4.71) states that the magnetic flux through the
boundary $S$ of the cylinder vanishes. The flux through the faces of the
cylinder at $z = \pm h/2$ vanishes because $B_z = 0$,¹⁸ while the flux through
the lateral surface of the cylinder is $2\pi\rho h B_\rho(\rho)$. Since this must vanish,
we get $B_\rho(\rho) = 0$ (for $\rho \neq 0$, i.e., outside the wire, that we have taken
here as infinitesimally thin).

The vanishing of $B_\rho$ and $B_z$ can actually be understood using a symmetry
argument based on parity. The argument, however, has some
subtleties, so it is instructive to go through it in some detail. Consider
the parity transformation (“Π”), $\mathbf x \to -\mathbf x$. Since $\mathbf j$ is a true vector (rather
than a pseudovector), it changes sign under parity, $\mathbf j(\mathbf x) \to -\mathbf j(-\mathbf x)$. In
our case $\mathbf j(\mathbf x)$ is proportional to a Dirac delta in the transverse plane
and is independent of $z$, so the change $\mathbf x \to -\mathbf x$ in its argument has no
effect, and simply $\mathbf j \to -\mathbf j$. So, if before parity $\mathbf j$ was in the $+z$ direction,
after the parity transformation it points in the $-z$ direction. Therefore,
the geometry of this problem is not invariant under parity alone. However,
we can combine this with a rotation by 180° around the $y$ axis,
that we denote by $R_y$, that sends the current back toward the positive
$z$ direction. Thus, the combined transformation $R_y\Pi$ is a symmetry
transformation of the system.

Consider now how the magnetic field $\mathbf B$, at a generic point $\mathbf x$ in space,
transforms under this combined operation. We found in eq. (3.95) that
$\mathbf B$ is a pseudovector under parity, i.e., for a static field, $\mathbf B(\mathbf x) \to +\mathbf B(-\mathbf x)$.
Note that its argument changes from $\mathbf x$ to $-\mathbf x$, i.e., the parity operation
sends the point $P = (x, y, z)$ into the antipodal point $P' = (-x, -y, -z)$.
We assume, without loss of generality, that the point $P$ has coordinates
$(x = 0, y, z)$ and we consider, for definiteness, a magnetic field that, in $P$,
is the sum of three vectors $\mathbf B_x$, $\mathbf B_y$, and $\mathbf B_z$ pointing, respectively, toward
the positive $x$, $y$, and $z$ axes, as shown in Fig. 4.6. Note that, having
set $x = 0$, a vector in the positive $x$ direction corresponds to a clockwise
azimuthal component, and a vector in the positive $y$ direction to an
outward radial component. After the parity operation, the three vector
components of the magnetic field $\mathbf B'$ at $P'$ will be as shown in the figure,
i.e., $\mathbf B'_x$, $\mathbf B'_y$, and $\mathbf B'_z$ would still point in the same directions as before
the transformation, since $\mathbf B \to +\mathbf B$; however, now $\mathbf B'_y$ corresponds to
an inward radial vector, and $\mathbf B'_x$ to a counterclockwise azimuthal vector.


After the subsequent $R_y$ rotation, the point $P'$ is transformed into the
point $P''$ in Fig. 4.6 and, because of the 180° rotation in the $(x, z)$
plane, $\mathbf B''_x = -\mathbf B'_x$ and $\mathbf B''_z = -\mathbf B'_z$, while $\mathbf B''_y = \mathbf B'_y$. The three vector
components of the magnetic field at $P''$ will therefore be as shown in the
figure. We can finally perform a 180° rotation around the $z$ axis, $R_z$,
that brings the point $P''$ back onto the initial point $P$. Thus, in the end,
after the combined $R_z R_y \Pi$ transformation, eventually $\mathbf x \to \mathbf x$, while

$$B_\rho(\mathbf x) \to -B_\rho(\mathbf x), \qquad B_\varphi(\mathbf x) \to +B_\varphi(\mathbf x), \qquad B_z(\mathbf x) \to -B_z(\mathbf x). \qquad (4.75)$$
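The bookkeeping behind eq. (4.75) can be reproduced with a few lines of Python. The sketch below applies parity and the two 180° rotations to a point $P = (0, y, z)$ and to the Cartesian components of a pseudovector; the matrix and function names are illustrative. It uses the fact, stated in the text, that under parity the components of a pseudovector do not change sign, while under rotations it transforms as an ordinary vector.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (tuple of rows) by a 3-component vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))


PARITY = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))     # x -> -x
ROT_Y_180 = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))   # 180 degrees around y
ROT_Z_180 = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))   # 180 degrees around z


def transform_point(p):
    """Apply parity, then R_y, then R_z to the coordinates of a point."""
    for m in (PARITY, ROT_Y_180, ROT_Z_180):
        p = mat_vec(m, p)
    return p


def transform_pseudovector(b):
    """Same sequence acting on pseudovector components.

    Parity leaves the components unchanged (B -> +B), so only the two
    rotations act on them.
    """
    for m in (ROT_Y_180, ROT_Z_180):
        b = mat_vec(m, b)
    return b


if __name__ == "__main__":
    print(transform_point((0, 2, 3)))          # the point comes back to itself
    print(transform_pseudovector((1, 2, 3)))   # (Bx, By, Bz) -> (Bx, -By, -Bz)
```

At a point with $x = 0$ and $y > 0$ the radial direction is $\hat{\mathbf y}$ and the azimuthal direction is $-\hat{\mathbf x}$, so the sign flips of $B_y$ and $B_z$, with $B_x$ unchanged, are exactly the content of eq. (4.75).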


Thus, we have found a transformation that leaves the system invariant,
and such that, under it, both $B_\rho(\mathbf x)$ and $B_z(\mathbf x)$ change sign; one
is therefore tempted to conclude that they must vanish. While, as we
have seen from the explicit computation, this conclusion is indeed true,
the correct logic still requires some more steps. Indeed, one can already
be perplexed by the fact that, with this argument based on symmetries,
it seems that we do not even need to impose the boundary condition
$B_z = 0$ at infinity, that was necessary in the derivation from eq. (4.73).
After all, symmetry arguments can be an elegant way of extracting the
consequences of the equations of a theory, but do not contain more
information than the equations themselves. In fact, the correct chain of
reasoning here is as follows. First of all, it is useful to stress that, behind
any use of arguments based on parity invariance, stands the fact that
Maxwell’s equations are indeed invariant under parity, as we discussed in
Section 3.4.¹⁹ Second, the invariance of the geometry of the problem and
of the equations governing a theory under a transformation is not yet
enough to guarantee that the corresponding solutions will also be invariant
under this transformation. A necessary condition is that the
boundary conditions, too, are chosen to be invariant under the symmetry
transformation. Maxwell’s equations are invariant under parity and under
rotations; still, a solution of Maxwell’s equations invariant under this
combination of parity and subsequent 180° rotations can only emerge
if the boundary conditions are also invariant under it. In our problem,
a natural boundary condition on $B_z$ is imposed at $\rho\to\infty$, and the only
such boundary condition that respects this symmetry is $B_z(\rho = \infty) = 0$,
which is indeed the condition that we imposed, for physical reasons,
when searching for the solution of eq. (4.73). In principle, we could rather
decide to impose a boundary condition $B_z(\rho = \infty) = B_\infty \neq 0$; this
would break the symmetry (in particular, the parity transformation).
The corresponding solution for $B_z(\rho)$, which in this case would simply
be a constant equal to the boundary value, $B_z(\rho) = B_\infty$, would not be
invariant under parity. Still, at the mathematical level, it would be a
perfectly well-defined solution of Maxwell’s equations. For $B_\rho$, instead,
the correct boundary condition is $B_\rho(\rho = 0) = 0$, since a radial field that
does not vanish at $\rho = 0$ would be mathematically ill-defined. Again,
this boundary condition respects the parity and rotation symmetries.

19 As we stressed there, despite its apparently obvious geometric nature,
invariance under parity is not guaranteed a priori. Electrodynamics happens to
respect it, but, for example, another fundamental interaction, the weak
interaction, is not invariant under parity; we do not perceive this in the
macroscopic world just because the range of weak interactions is limited to
subnuclear scales.

20 See also the discussion on spontaneous symmetry breaking in Note 39 on
page 86.

21 More accurately, they are covariant, i.e., the left- and right-hand sides of the
equations transform in the same way. See Note 15 on page 50.

Fig. 4.7 Some field lines of the solution for the magnetic field.

22 It is also interesting to find the corresponding solution for the vector
potential. Using the expression (1.33) for the curl in cylindrical coordinates, one
can see directly that the magnetic field (4.79) can be obtained from

$$\mathbf A(\mathbf x) = A_z(\rho)\,\hat{\mathbf z}, \qquad (4.80)$$
with
$$A_z(\rho) = -\frac{\mu_0 I}{2\pi}\,\ln\rho. \qquad (4.81)$$

This gauge potential satisfies the Coulomb gauge condition, since
$\nabla\cdot\mathbf A = \partial_z A_z(\rho) = 0$. Note that $\mathbf A$ grows without bounds at large distances
from the wire. This is an artifact of having taken a current distribution that is
not localized, and rather extends from $z = -\infty$ to $z = +\infty$. We will see in
Section 6.2 that, for a localized current distribution, $\mathbf A$ vanishes at infinity.

Last but not least, in general, even when the geometry of the problem,
the equations of the theory, and the boundary conditions respect
a symmetry, this does not yet necessarily imply that the solution
respects it. This only happens if the solution is unique. The other option
is that there is a family of solutions, that transform into each other 
under this symmetry transformation.²⁰ For instance, in our case, using
only symmetry arguments, one could not exclude the possibility that,
even after imposing the boundary condition $B_z(\rho = \infty) = 0$, there could
be two solutions, one with $B_z(\rho) > 0$ and one with $B_z(\rho) < 0$, that
both approach zero at large $\rho$, and that transform into each other under
parity. In our case this does not happen because, as we have seen at
the end of Section 4.1.5, in magnetostatics, once the boundary
conditions are assigned, the solution is unique. So, the complete logic of the
symmetry argument is: (1) Maxwell’s equations are invariant under parity
and under rotations.²¹ (2) The problem is invariant under a combination
of parity and 180° rotations. (3) For physical reasons, as boundary
condition on $B_z$ we choose $B_z(\rho = \infty) = 0$, while on $B_\rho$ we impose
$B_\rho(\rho = 0) = 0$ for mathematical consistency; these boundary conditions
are invariant under the combined parity and 180° rotation transformations.
(4) In magnetostatics, once given the boundary conditions, the
solution is unique. Then, we can finally conclude that the solution must
be invariant under this combined parity plus rotation transformation
and, since under this transformation $B_\rho$ and $B_z$ change sign, they must
vanish.

In conclusion, either from the explicit computation using the integrated
Maxwell’s equations, or from the symmetry argument, we find
that in cylindrical coordinates the only non-vanishing component of the
magnetic field is $B_\varphi(\rho)$, and eq. (4.72) becomes

$$\mathbf B(\rho,\varphi,z) = B_\varphi(\rho)\,\hat{\boldsymbol\varphi}. \qquad (4.76)$$

The function $B_\varphi(\rho)$ can now be determined using the integrated form
of Ampère’s law, eq. (4.70). Taking as curve $C$ a circle of radius $\rho$ in
the plane transverse to the wire and centered on the wire, as shown in
Fig. 4.4, in eq. (4.70) we have $d\boldsymbol\ell = \rho\,d\varphi\,\hat{\boldsymbol\varphi}$, so


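$$\rho\int_0^{2\pi} d\varphi\, B_\varphi(\rho) = \mu_0 I, \qquad (4.77)$$
and therefore
$$B_\varphi(\rho) = \frac{\mu_0 I}{2\pi\rho}. \qquad (4.78)$$
In conclusion,
$$\mathbf B(\rho,\varphi,z) = \frac{\mu_0 I}{2\pi\rho}\,\hat{\boldsymbol\varphi}. \qquad (4.79)$$

Some field lines of this solution in a transverse plane are shown in
Fig. 4.7.²²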
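Since the integrated Ampère’s law (4.70) holds for any closed curve around the wire, not only for the circle used above, a simple numerical check is to take the circulation of the field (4.79) around a square. The Python sketch below does this with a midpoint rule; the function names and the value used for $\mu_0$ (the SI value) are illustrative choices.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A (SI value)


def wire_field(current, x, y):
    """Cartesian components (Bx, By) of eq. (4.79), wire along the z axis."""
    rho2 = x * x + y * y
    pref = MU0 * current / (2.0 * math.pi * rho2)
    # phi_hat = (-y, x) / rho and B_phi = mu0 I / (2 pi rho)
    return (-pref * y, pref * x)


def circulation_square(current, half_side, n=20_000):
    """Counterclockwise line integral of B around a square centered on the wire."""
    s = half_side
    h = 2.0 * s / n
    total = 0.0
    for i in range(n):
        t = -s + (i + 0.5) * h
        total += wire_field(current, s, t)[1] * h    # right side, dl along +y
        total -= wire_field(current, t, s)[0] * h    # top side, dl along -x
        total -= wire_field(current, -s, t)[1] * h   # left side, dl along -y
        total += wire_field(current, t, -s)[0] * h   # bottom side, dl along +x
    return total


if __name__ == "__main__":
    current = 3.0
    print(circulation_square(current, 0.2) / (MU0 * current))  # close to 1
```

The ratio printed at the end does not depend on the size of the square, as expected from eq. (4.70).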

It is instructive to see how the same result can be obtained from a
direct integration of eqs. (4.73) and (4.74), since some subtleties appear.
Equation (4.74) tells us that $\rho B_\rho(\rho) = a$, with $a$ a constant, so
$B_\rho(\rho) = a/\rho$. Setting $a = 0$ we get back the result $B_\rho(\rho) = 0$ found
previously. However, it seems that any field of the form $B_\rho(\rho)\hat{\boldsymbol\rho} = a\hat{\boldsymbol\rho}/\rho$,
with arbitrary $a$, would give an acceptable solution of eq. (4.74). This,
however, is not the case, because eq. (4.74) becomes singular at $\rho = 0$,
and more care is needed. To compute the divergence of a vector field
that, in cylindrical coordinates, has the form $\mathbf v(\rho, \varphi, z) = \hat{\boldsymbol\rho}/\rho$, we
proceed similarly to what we did in eqs. (1.86) and (1.87) for a field that,
in polar coordinates, had the form $\hat{\mathbf r}/r^2$. In the present case, we take as
volume $V$ the same cylinder used in Fig. 4.5, with radius $\rho$ and height
$h$. Its lateral surface has a surface element $d\mathbf s = dz\,\rho\,d\varphi\,\hat{\boldsymbol\rho}$. Then, using
Gauss’s theorem

$$\int_V d^3x\,\nabla\cdot\mathbf v = \int_{\partial V} d\mathbf s\cdot\mathbf v = \int_{-h/2}^{h/2} dz\int_0^{2\pi}\rho\,d\varphi\;\hat{\boldsymbol\rho}\cdot\frac{\hat{\boldsymbol\rho}}{\rho} = 2\pi h. \qquad (4.82)$$


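We now write $d^3x$, on the left-hand side of eq. (4.82), as $d^3x = dz\,d^2x_\perp$,
where $\mathbf x_\perp = x\hat{\mathbf x} + y\hat{\mathbf y}$ is a two-dimensional vector spanning the transverse
plane. Then, the left-hand side of eq. (4.82) can also be written as

$$\int_V d^3x\,\nabla\cdot\mathbf v = \int_{-h/2}^{h/2} dz\int_{|\mathbf x_\perp|<\rho} d^2x_\perp\,\nabla\cdot\mathbf v = h\int_{|\mathbf x_\perp|<\rho} d^2x_\perp\,\nabla\cdot\mathbf v, \qquad (4.83)$$

where the integral over $dz$ is trivial because, for the vector field
$\mathbf v(\rho, \varphi, z) = \hat{\boldsymbol\rho}/\rho$ that we are considering, $\nabla\cdot\mathbf v$ is independent of $z$. Since this holds
for arbitrary $\rho > 0$, and therefore also for $\rho$ infinitesimally small,
comparison of eqs. (4.82) and (4.83) shows that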


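$$\nabla\cdot\mathbf v = 2\pi\,\delta^{(2)}(\mathbf x_\perp), \qquad \text{where} \qquad \mathbf v(\rho,\varphi,z) = \frac{\hat{\boldsymbol\rho}}{\rho}. \qquad (4.84)$$

Therefore, a magnetic field of the form $\mathbf B(\rho, \varphi, z) = a\hat{\boldsymbol\rho}/\rho$ would satisfy

$$\nabla\cdot\mathbf B = 2\pi a\,\delta^{(2)}(\mathbf x_\perp), \qquad (4.85)$$

so it is not a solution of $\nabla\cdot\mathbf B = 0$, unless $a = 0$.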
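The delta-function source in eq. (4.84) can be made tangible numerically: the outward flux (per unit height) of $\mathbf v = \hat{\boldsymbol\rho}/\rho$ through any closed curve in the transverse plane is $2\pi$ if the curve encloses the $z$ axis, and zero otherwise. The following Python sketch evaluates this flux through the boundary of an axis-aligned rectangle; the function name and the choice of a rectangle are illustrative assumptions.

```python
import math


def flux_through_rectangle(x0, x1, y0, y1, n=20_000):
    """Outward flux per unit height of v = rho_hat / rho through the
    boundary of the rectangle [x0, x1] x [y0, y1] (midpoint rule).

    The boundary must not pass through the origin.
    """
    def v(x, y):
        r2 = x * x + y * y
        return (x / r2, y / r2)   # Cartesian components of rho_hat / rho

    hx = (x1 - x0) / n
    hy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        y = y0 + (i + 0.5) * hy
        total += (v(x1, y)[0] - v(x0, y)[0]) * hy   # right and left sides
        total += (v(x, y1)[1] - v(x, y0)[1]) * hx   # top and bottom sides
    return total


if __name__ == "__main__":
    print(flux_through_rectangle(-1.0, 1.0, -1.0, 1.0))      # encloses the axis
    print(flux_through_rectangle(-0.01, 0.01, -0.01, 0.01))  # smaller, same flux
    print(flux_through_rectangle(1.0, 2.0, -1.0, 1.0))       # axis outside
```

That the flux stays equal to $2\pi$ however small the enclosing rectangle is made is the numerical counterpart of the statement that $\nabla\cdot\mathbf v$ is concentrated on the $z$ axis.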
Finally, we can determine $B_\varphi$ from a direct integration of eq. (4.73),
whose $\hat{\mathbf z}$ component is

$$\partial_\rho(\rho B_\varphi) = \mu_0\, j_z(\rho)\,\rho. \qquad (4.86)$$


Rather than taking an infinitesimally thin wire, it is simpler here to
take a model for a circular wire of radius $a$, with $j_z$ constant for $\rho < a$
and $j_z = 0$ for $\rho > a$. Then, integrating eq. (4.86) from $\rho = 0$ to

23 It should be appreciated how, in this problem, the use of the integrated
Maxwell’s equations provided a simpler and more elegant way of obtaining the
solution for $B_\rho$ and $B_\varphi$. In particular, in the direct integration of eq. (4.74),
we had to deal with the subtle issue of the spurious solution $B_\rho(\rho) = a/\rho$ with
non-vanishing $a$ that, as we have seen, generates a Dirac delta on the $z$ axis in
$\nabla\cdot\mathbf B$, and is therefore not acceptable. In the integrated form of the
equation, in contrast, we only dealt with the field at the surface of the cylinder
used in the integrated form of the equation, and the spurious solution never
appeared. Notice also that, for computing $B_\varphi$ from the direct integration
of eq. (4.73), we had to make a choice for the functional form of $j_z$ inside the
wire, that we just took to be a constant for $\rho < a$. The fact that the radius $a$
of the wire eventually disappeared from the right-hand side of eq. (4.89) is an
indication of the fact that the result, outside the wire, is independent of the
model used; however, the integrated form of the Ampère law makes
this apparent, since no model of the wire was ever needed.

24 As in the electrostatic case, we could add an arbitrary time-dependent
solution of the equation $\Box\mathbf A = 0$ that, as we will see in Chapter 9, describes plane
waves. Here, we only consider the solution sourced by the current.

25 Recall that, in Cartesian coordinates, the Laplacian of a vector field has the
same form as the Laplacian of a scalar field on each component, so we have
$\nabla^2 A^i = -\mu_0 j^i$, with $\nabla^2 = \partial_x^2 + \partial_y^2 + \partial_z^2$ just as on scalars. Correspondingly,
also the Green’s function for each component is the same as in the scalar case.
As discussed in Note 2 on page 4, this is no longer true in polar or cylindrical
coordinates. However, working in Cartesian coordinates suffices to obtain the
Green’s function and solve eq. (4.91). Then, since the solution is written in a
vector form, it holds in any coordinate system.



$\rho \le a$ (with the boundary condition $B_\varphi(\rho = 0) = 0$, necessary because
a non-vanishing $\varphi$ component would be singular at $\rho = 0$) we get

$$B_\varphi(\rho) = \frac{\mu_0\, j_z\,\rho}{2}, \qquad (\rho \le a), \qquad (4.87)$$
so
$$B_\varphi(\rho = a) = \frac{\mu_0 I}{2\pi a}, \qquad (4.88)$$

where $I = j_z\pi a^2$ is the current flowing through the wire (recall that $j_z$
is a current per unit surface). Outside the wire, i.e., for $\rho > a$, $j_z = 0$,
and eq. (4.86) gives $B_\varphi \propto 1/\rho$. The proportionality constant is obtained
requiring continuity at $\rho = a$, which then gives

$$B_\varphi(\rho) = \frac{\mu_0 I}{2\pi\rho}, \qquad (\rho \ge a), \qquad (4.89)$$
so we have recovered eq. (4.78) outside the wire (and we can now also
send $a \to 0$ to describe an infinitesimally thin wire carrying a fixed
current $I$).²³
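The direct integration of eq. (4.86) for the uniform-wire model can also be carried out numerically and compared with the closed forms (4.87) and (4.89). The Python sketch below is a minimal illustration under the same assumptions as the text (constant $j_z$ for $\rho < a$, zero outside, and $B_\varphi(0) = 0$); the function names and the step count are arbitrary choices.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A (SI value)


def b_phi_closed_form(rho, current, a):
    """B_phi from eq. (4.87) inside the wire and eq. (4.89) outside."""
    if rho <= a:
        j_z = current / (math.pi * a**2)
        return MU0 * j_z * rho / 2.0
    return MU0 * current / (2.0 * math.pi * rho)


def b_phi_integrated(rho, current, a, n=100_000):
    """B_phi(rho) = (1/rho) * integral_0^rho mu0 j_z(r) r dr,
    i.e. eq. (4.86) integrated with B_phi(0) = 0 (midpoint rule)."""
    j_z = current / (math.pi * a**2)
    h = rho / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        if r < a:                 # the current density vanishes outside the wire
            total += MU0 * j_z * r * h
    return total / rho


if __name__ == "__main__":
    current, a = 2.0, 1e-3
    for rho in (0.5 * a, a, 3.0 * a):
        print(rho, b_phi_integrated(rho, current, a), b_phi_closed_form(rho, current, a))
```

Outside the wire the result depends only on the total current, which can be seen by changing `a` at fixed `current` and fixed `rho > a`.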


4.2.2 Magnetic field of a static current density 


We now compute the magnetic field produced by a current density $\mathbf j(\mathbf x)$,
that we take localized in space and time-independent, but otherwise
arbitrary, and we also set $\rho = 0$ for the total charge density. According
to eq. (3.22), this implies

$$\nabla\cdot\mathbf j = 0. \qquad (4.90)$$


This situation is realized, for instance, if we have a wire forming a loop, 
with a steady current flowing through it. Within the wire, the positive 
charge density of the ions compensates for the negative charge density of 
the flowing electrons, so there is no net electric charge density. However, 
the ions do not have a bulk motion, while the electrons do, thereby 
creating a net electric current. The condition $\nabla\cdot\mathbf j = 0$ expresses the
fact that electrons are neither created nor destroyed inside the wire and,
in each infinitesimal volume inside the wire, bounded by surfaces $S_1$
and $S_2$ in the transverse directions and of length $d\ell$ in the longitudinal
direction, the flow of electrons entering through $S_1$ is compensated by
the electrons flowing out through $S_2$.

It is convenient to work in terms of the gauge potentials, in the
Coulomb gauge. Since $\rho = 0$, the solution of eq. (3.93) is $\phi = 0$ and,
looking for a static solution $\mathbf A(\mathbf x)$,²⁴ eq. (3.94) reduces to

$$\nabla^2\mathbf A = -\mu_0\,\mathbf j. \qquad (4.91)$$
The solution is obtained again from the Green’s function (4.15) of the
Laplacian, and is given by²⁵
$$\mathbf A(\mathbf x) = \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\mathbf j(\mathbf x')}{|\mathbf x - \mathbf x'|}, \qquad (4.92)$$

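as long as the integral converges at infinity. This is the case, in particular,
with our assumption that $\mathbf j(\mathbf x)$ is localized in space. As a check,
observe that this solution was found by writing Maxwell’s equations in
the Coulomb gauge $\nabla\cdot\mathbf A = 0$, and therefore must satisfy this gauge
condition. This indeed holds, thanks to eq. (4.90). In fact,

$$\begin{aligned}
\nabla_{\mathbf x}\cdot\mathbf A(\mathbf x) &= \frac{\mu_0}{4\pi}\int d^3x'\,\mathbf j(\mathbf x')\cdot\nabla_{\mathbf x}\frac{1}{|\mathbf x - \mathbf x'|} \\
&= -\frac{\mu_0}{4\pi}\int d^3x'\,\mathbf j(\mathbf x')\cdot\nabla_{\mathbf x'}\frac{1}{|\mathbf x - \mathbf x'|} \\
&= \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\nabla_{\mathbf x'}\cdot\mathbf j(\mathbf x')}{|\mathbf x - \mathbf x'|} \\
&= 0,
\end{aligned} \qquad (4.93)$$

where we added to the $\nabla$ operator a label $\mathbf x$ or $\mathbf x'$ to stress on which
variable it acts. We then integrated $\nabla_{\mathbf x'}$ by parts (setting to zero the
boundary terms, by the assumption that $\mathbf j(\mathbf x)$ is localized), and we finally
used eq. (4.90). Then, from $\mathbf B = \nabla\times\mathbf A$,

$$\mathbf B(\mathbf x) = \frac{\mu_0}{4\pi}\,\nabla\times\left(\int d^3x'\,\frac{\mathbf j(\mathbf x')}{|\mathbf x - \mathbf x'|}\right), \qquad (4.94)$$
or, equivalently,²⁶
$$\mathbf B(\mathbf x) = \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\mathbf j(\mathbf x')\times(\mathbf x - \mathbf x')}{|\mathbf x - \mathbf x'|^3}. \qquad (4.95)$$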


As an application, consider the situation in which the current is carried
by a thin wire that forms a closed loop. We idealize the thin wire as
a loop $C$ of zero thickness and, at a generic point $\mathbf x \in C$, we denote by
$d\boldsymbol\ell$ the vector tangent to the loop (after having chosen one of the two
possible directions for running along the loop $C$), and of infinitesimal
length $d\ell$, see Fig. 4.8. We also denote by $\hat{\mathbf e}_{d\ell}$ the unit vector in the
direction of $d\boldsymbol\ell$, so that

$$d\boldsymbol\ell = d\ell\;\hat{\mathbf e}_{d\ell}. \qquad (4.96)$$

We denote by $\mathbf x_\perp$ the two-dimensional Cartesian coordinates orthogonal
to $d\boldsymbol\ell$ at the point $\mathbf x$ so, for instance, if in $\mathbf x$ we have $d\boldsymbol\ell = d\ell\,\hat{\mathbf x}$, then
$\mathbf x_\perp = y\hat{\mathbf y} + z\hat{\mathbf z}$. It is convenient to introduce a one-dimensional coordinate
$\ell$ that parametrizes the position of a point on the loop $C$, as follows. We
arbitrarily choose a point $P$ in $C$, and we assign to it the value $\ell = 0$.
This corresponds to a choice of origin for this coordinate. Given another
point $P' \in C$, we assign it the coordinate $\ell$ given by


$$\ell = \int_P^{P'} d\ell, \qquad (4.97)$$

where the integral is a line integral along $C$. We have therefore constructed
a convenient coordinate system $(\ell, \mathbf x_\perp)$, useful along the loop
and in its immediate neighborhood.²⁷

26 Explicitly,

$$\begin{aligned}
B_i(\mathbf x) &= \frac{\mu_0}{4\pi}\,\epsilon_{ijk}\,\partial_j\int d^3x'\,\frac{j_k(\mathbf x')}{|\mathbf x - \mathbf x'|} \\
&= \frac{\mu_0}{4\pi}\,\epsilon_{ijk}\int d^3x'\, j_k(\mathbf x')\,\partial_j\frac{1}{|\mathbf x - \mathbf x'|} \\
&= -\frac{\mu_0}{4\pi}\,\epsilon_{ijk}\int d^3x'\, j_k(\mathbf x')\,\frac{x_j - x'_j}{|\mathbf x - \mathbf x'|^3} \\
&= +\frac{\mu_0}{4\pi}\,\epsilon_{ijk}\int d^3x'\, j_j(\mathbf x')\,\frac{x_k - x'_k}{|\mathbf x - \mathbf x'|^3},
\end{aligned}$$

where we used eq. (4.19).

Fig. 4.8 A graphical illustration of the definitions of $d\boldsymbol\ell$ and $\mathbf x_\perp$, for a
loop $C$.

27 Actually, in general this coordinate system is well defined (in the sense
that it is in one-to-one correspondence with the $\mathbf x$ coordinates of the
three-dimensional space in which $C$ is embedded), only in a sufficiently small
transverse region around the loop. Indeed, for a closed loop, starting from a
point $P$ on the loop with coordinates $(\ell = \ell_1, \mathbf x_\perp = 0)$, and moving in the
transverse direction, for some value $\mathbf x_\perp^*$ of the coordinates $\mathbf x_\perp$ we would
eventually reach another point $P'$ of the loop, corresponding to a value $\ell_2$ of the
$\ell$ coordinate. Then, the coordinate values $(\ell = \ell_1, \mathbf x_\perp = \mathbf x_\perp^*)$ and
$(\ell = \ell_2, \mathbf x_\perp = 0)$ would describe the same point in three-dimensional space.
However, in the following, we only use this coordinate system in an infinitesimal
neighborhood of the loop, so the problem does not arise.

28 Sometimes, the name “Biot-Savart law” is given directly to the more general
expression (4.95).

The idealization of zero thickness
means that the current density $\mathbf j(\mathbf x) = \mathbf j(\ell, \mathbf x_\perp)$ is a two-dimensional
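Dirac delta in the transverse directions, i.e.,

$$\mathbf j(\ell, \mathbf x_\perp) = I(\ell)\,\delta^{(2)}(\mathbf x_\perp)\,\hat{\mathbf e}_{d\ell}, \qquad (4.98)$$

where $I(\ell, \mathbf x_\perp = 0) = I(\ell)$ could a priori still be a function of $\ell$ (while
the multiplication by $\delta^{(2)}(\mathbf x_\perp)$ allows us to set to zero its argument $\mathbf x_\perp$).
However, the condition $\nabla\cdot\mathbf j = 0$ implies that $I(\ell)$ is actually independent
also of $\ell$. For instance, if at the point $\mathbf x$ we orient the axes so that
$d\boldsymbol\ell = dx\,\hat{\mathbf x}$, in $\mathbf x$ we have $\mathbf j = (j_x, 0, 0)$ and current conservation becomes
$\partial_x j_x = 0$. Since in this case $d\ell = dx$, this condition is equivalent to

$$\frac{dI(\ell)}{d\ell} = 0, \qquad (4.99)$$

and this holds on a generic point on the loop, i.e., for $\ell$ generic. Therefore,
for an infinitesimally thin wire

$$\mathbf j(\mathbf x) = I\,\delta^{(2)}(\mathbf x_\perp)\,\hat{\mathbf e}_{d\ell}, \qquad (4.100)$$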


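with $I$ a constant. Recall that $\mathbf j(\mathbf x)$ is a current per unit surface. Since

$$\int d^2x_\perp\,\mathbf j(\mathbf x) = I\,\hat{\mathbf e}_{d\ell}, \qquad (4.101)$$

$I$ is just the current flowing in the wire. Its independence of $\ell$ means
that, in a single closed loop, the current is the same at all points of the
loop. Sufficiently close to a point on the loop labeled, in three-space, by
the coordinates $\mathbf x$, the variables $\mathbf x_\perp$ and $\ell$ form an orthogonal system of
coordinates, so we have $d^3x = d^2x_\perp\,d\ell$. Therefore

$$\mathbf j(\mathbf x)\,d^3x = I\,\delta^{(2)}(\mathbf x_\perp)\,\hat{\mathbf e}_{d\ell}\;d^2x_\perp\,d\ell = I\,\delta^{(2)}(\mathbf x_\perp)\,d^2x_\perp\,d\boldsymbol\ell, \qquad (4.102)$$

where we used eq. (4.96). Inside an integral, we can first carry out the
integration over $d^2x_\perp$ with the help of the Dirac delta; this amounts to
replacing

$$\mathbf j(\mathbf x)\,d^3x \to I\,d\boldsymbol\ell, \qquad (4.103)$$

while setting $\mathbf x_\perp = 0$ in any occurrence of $\mathbf x_\perp$ in the integrand.
Equations (4.92) and (4.95) can then be written as loop integrals, as

$$\mathbf A(\mathbf x) = \frac{\mu_0 I}{4\pi}\oint_C d\boldsymbol\ell\,\frac{1}{|\mathbf x - \mathbf x(\ell)|}, \qquad (4.104)$$

and

$$\mathbf B(\mathbf x) = \frac{\mu_0 I}{4\pi}\oint_C \frac{d\boldsymbol\ell\times[\mathbf x - \mathbf x(\ell)]}{|\mathbf x - \mathbf x(\ell)|^3}, \qquad (4.105)$$

respectively, where $\mathbf x$ is the generic point in space where we compute the
field, and $\mathbf x(\ell)$ is the coordinate in three-dimensional space of the point
of the loop parametrized by $\ell$. Equation (4.105) is called the Biot-Savart
law.²⁸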
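The loop integral (4.105) lends itself to a direct numerical evaluation: the loop is replaced by a closed polygon, and each straight segment contributes $d\boldsymbol\ell\times[\mathbf x - \mathbf x(\ell)]/|\mathbf x - \mathbf x(\ell)|^3$ evaluated at its midpoint. The Python sketch below does this for a circular loop and compares the result on the axis with the standard closed form $B_z = \mu_0 I R^2/[2(R^2+z^2)^{3/2}]$, which is a well-known result but is not derived in this section. All function names are illustrative.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A (SI value)


def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def biot_savart(current, loop_points, x):
    """Discretized eq. (4.105) for a closed polygon through loop_points."""
    b = [0.0, 0.0, 0.0]
    n = len(loop_points)
    for k in range(n):
        p, q = loop_points[k], loop_points[(k + 1) % n]
        dl = tuple(q[i] - p[i] for i in range(3))
        mid = tuple(0.5 * (p[i] + q[i]) for i in range(3))
        r = tuple(x[i] - mid[i] for i in range(3))      # x - x(l)
        r3 = math.sqrt(sum(c * c for c in r)) ** 3
        c = cross(dl, r)
        for i in range(3):
            b[i] += MU0 * current / (4.0 * math.pi) * c[i] / r3
    return tuple(b)


def circle(radius, n=2000):
    """Points of a counterclockwise circle in the plane z = 0."""
    return [(radius * math.cos(2.0 * math.pi * k / n),
             radius * math.sin(2.0 * math.pi * k / n),
             0.0) for k in range(n)]


if __name__ == "__main__":
    radius, current, z = 0.1, 1.5, 0.05
    bz_exact = MU0 * current * radius**2 / (2.0 * (radius**2 + z**2) ** 1.5)
    print(biot_savart(current, circle(radius), (0.0, 0.0, z)), bz_exact)
```

The same `biot_savart` function can be used for any closed polygonal loop, since nothing in it is specific to the circle.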


4.2.3 Force of a magnetic field on a wire and 
between two parallel wires 


We now compute the force exerted by an external magnetic field $\mathbf B$ on
a wire carrying a current, and we will then use this result to compute
the force between two parallel wires. In Section 4.2.4, we will compute
the force between two arbitrary current distributions.

Consider a line element $d\boldsymbol\ell = d\ell\,\hat{\mathbf e}_{d\ell}$ of the wire, see eq. (4.96). We
denote by $n_e$ the number of electrons per unit length in the wire, all
taken to have a velocity $\mathbf v = -v\,\hat{\mathbf e}_{d\ell}$. Given that the electron charge is
$q = -e < 0$, we have $q\mathbf v = ev\,\hat{\mathbf e}_{d\ell}$, so the current flows in the direction
$+\hat{\mathbf e}_{d\ell}$.²⁹ The Lorentz force (3.5) on a single electron, due to the external
magnetic field, is given by $\mathbf F = (-e)(-v\,\hat{\mathbf e}_{d\ell})\times\mathbf B = ev\,\hat{\mathbf e}_{d\ell}\times\mathbf B$. The total
charge contained in a line element $d\ell$ is $(-e)n_e\,d\ell$, so the force on it is
$d\mathbf F = (-e n_e\,d\ell)(-v\,\hat{\mathbf e}_{d\ell})\times\mathbf B = (e n_e v)\,d\ell\,\hat{\mathbf e}_{d\ell}\times\mathbf B$. In a time $dt$, the total
charge passing through a transverse surface of the wire is $dQ = e n_e(v\,dt)$,
so the current $I = dQ/dt$ flowing in the wire is $I = e n_e v$. Therefore

$$d\mathbf F = I\,d\boldsymbol\ell\times\mathbf B, \qquad (4.106)$$
or, equivalently,
$$\frac{d\mathbf F}{d\ell} = I\,\hat{\mathbf e}_{d\ell}\times\mathbf B, \qquad (4.107)$$
and the total force on the loop is³⁰
$$\mathbf F = I\oint_C d\boldsymbol\ell\times\mathbf B. \qquad (4.109)$$

More generally, if we have a current density $\mathbf j(\mathbf x)$ not necessarily confined
to a one-dimensional wire, we can repeat the same argument with
$\mathbf j(\mathbf x)\,d^3x$ replacing $I\,d\boldsymbol\ell$ [compare with eq. (4.103)], and we get³¹

$$\mathbf F = \int d^3x\;\mathbf j(\mathbf x)\times\mathbf B(\mathbf x). \qquad (4.110)$$
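A discretized version of eq. (4.109) is easy to write down: the loop is approximated by a closed polygon and the field is evaluated at the midpoint of each segment. The following Python sketch (with illustrative names, and with two hypothetical test fields chosen only for the example) shows that the total force on a closed loop vanishes in a uniform field, while it does not in a field that varies across the loop.

```python
def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def loop_force(current, vertices, b_field):
    """F = I * sum over segments of dl x B(midpoint), cf. eq. (4.109).

    vertices: corners of a closed polygon, in order.
    b_field:  function mapping a point (x, y, z) to (Bx, By, Bz).
    """
    force = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        p, q = vertices[k], vertices[(k + 1) % n]
        dl = tuple(q[i] - p[i] for i in range(3))
        mid = tuple(0.5 * (p[i] + q[i]) for i in range(3))
        f = cross(dl, b_field(mid))
        for i in range(3):
            force[i] += current * f[i]
    return tuple(force)


if __name__ == "__main__":
    square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    # Uniform field: the total force on the closed loop vanishes.
    print(loop_force(2.0, square, lambda x: (0.3, -0.2, 0.5)))
    # Field B_z = x (in arbitrary units): the forces on opposite sides no longer cancel.
    print(loop_force(2.0, square, lambda x: (0.0, 0.0, x[0])))
```

For the piecewise-straight loop and the linear field used here the midpoint rule is exact, so the second result is not affected by discretization error.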


We next compute the force between two infinite straight parallel wires
carrying steady currents $I_1$ and $I_2$. We consider first the case where
the two currents are in the same direction, that we take to be the $+z$
direction, and we denote by $d$ the distance between the two wires. We
denote by $d\mathbf F_2/d\ell$ the force per unit length on the second wire (which
we take parallel to the $z$ axis and located at a distance $d$ from it) due
to the magnetic field of the first wire (again parallel to the $z$ axis and
located in $\mathbf x_\perp = 0$). From eq. (4.107),

$$\frac{d\mathbf F_2}{d\ell} = I_2\,\hat{\mathbf z}\times\mathbf B_1, \qquad (4.111)$$

29 This is not how currents really flow in metals. As we will discuss in
Section 14.4, individual electrons undergo collisions on microscopic distance
scales, and $v$ should really be understood as a macroscopic drift velocity of
the ensemble of electrons, not as the velocity of individual electrons. However,
once understood in this average sense, this derivation of the force on a wire is
correct.

30 Observe that, if $\mathbf B(\mathbf x) = \mathbf B_0$ is spatially constant, the total force on a
closed wire vanishes,

$$\mathbf F = I\left(\oint_C d\boldsymbol\ell\right)\times\mathbf B_0 = 0, \qquad (4.108)$$
since $\oint_C d\boldsymbol\ell = 0$.

31 Actually, we already knew this result from the derivation of momentum
conservation in eq. (3.68). Rather than deriving eq. (4.109) from the Lorentz
force on the individual electrons, we could have equivalently started from
eq. (3.68) to write eq. (4.110) and then, specializing to a wire using eq. (4.103),
we get eq. (4.109).

32 Note that, in order for $\{\hat{\boldsymbol\rho}, \hat{\boldsymbol\varphi}, \hat{\mathbf z}\}$ to be a right-handed frame, we must have
$\hat{\boldsymbol\rho}\times\hat{\boldsymbol\varphi} = +\hat{\mathbf z}$, so $\hat{\mathbf z}\times\hat{\boldsymbol\varphi} = -\hat{\boldsymbol\rho}$. This convention has already been
implicitly used when we derived the solution (4.79), since it is implied by the
expressions (1.30) and (1.33) of the divergence and curl in cylindrical
coordinates, see Note 1 on page 4.

33 This is the case, in particular, if the two current densities are spatially
non-overlapping, as, for instance, in the case where they are confined to two
different wires.



where $\mathbf B_1$ is the magnetic field generated by the first wire. Using eq. (4.79)
gives

$$\frac{d\mathbf F_2}{d\ell} = I_2\,\hat{\mathbf z}\times\left(\frac{\mu_0 I_1}{2\pi d}\,\hat{\boldsymbol\varphi}\right) = -\frac{\mu_0}{2\pi}\,\frac{I_1 I_2}{d}\,\hat{\boldsymbol\rho}, \qquad (4.112)$$

where we used $\hat{\mathbf z}\times\hat{\boldsymbol\varphi} = -\hat{\boldsymbol\rho}$.³² We have therefore recovered eq. (2.7), that
we anticipated in Section 2.1. This shows that the vacuum magnetic
permeability $\mu_0$, that was originally defined from eq. (2.7), is indeed the
same as the constant $\mu_0$ that appears in the Ampère law (4.67). Note
that, if the currents are parallel, the force is attractive. The case of
anti-parallel currents can be obtained reversing the sign of $I_2$ in the
previous derivation, and therefore the force becomes repulsive.
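As a small numerical illustration of eq. (4.112), the Python sketch below returns the component of $d\mathbf F_2/d\ell$ along $\hat{\boldsymbol\rho}$ (the unit vector pointing from wire 1 to wire 2), so that a negative value means attraction. The function name and the SI value used for $\mu_0$ are choices made for this example.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A (SI value)


def force_per_length_on_wire2(i1, i2, d):
    """Component along rho_hat of dF_2/dl from eq. (4.112).

    rho_hat points from wire 1 to wire 2: negative means attraction,
    positive means repulsion.
    """
    b1_phi = MU0 * i1 / (2.0 * math.pi * d)   # eq. (4.79) evaluated at wire 2
    # I_2 z_hat x (B_phi phi_hat) = -I_2 B_phi rho_hat
    return -i2 * b1_phi


if __name__ == "__main__":
    print(force_per_length_on_wire2(1.0, 1.0, 1.0))    # parallel currents
    print(force_per_length_on_wire2(1.0, -1.0, 1.0))   # anti-parallel currents
```

For two currents of 1 A at a distance of 1 m the magnitude is very close to $2\times10^{-7}$ N/m, with the sign depending on the relative direction of the currents.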


4.2.4 Force between generic static current 
distributions 
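We can now compute the magnetic force between two arbitrary static
current densities $\mathbf j_1(\mathbf x)$ and $\mathbf j_2(\mathbf x)$. We set again $\rho_1 = \rho_2 = 0$, and we
take the currents to be separately conserved,³³ so that
$$\nabla\cdot\mathbf j_1 = \nabla\cdot\mathbf j_2 = 0. \qquad (4.113)$$

From eq. (4.110), the force $\mathbf F_1$ exerted on the current density $\mathbf j_1$ by the
magnetic field generated by the current density $\mathbf j_2$ is

$$\mathbf F_1 = \int d^3x\;\mathbf j_1(\mathbf x)\times\mathbf B_2(\mathbf x). \qquad (4.114)$$
Using eq. (4.95),
$$\mathbf B_2(\mathbf x) = \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\mathbf j_2(\mathbf x')\times(\mathbf x - \mathbf x')}{|\mathbf x - \mathbf x'|^3}, \qquad (4.115)$$
and therefore
$$\begin{aligned}
\mathbf F_1 &= \frac{\mu_0}{4\pi}\int d^3x\,d^3x'\;\frac{\mathbf j_1(\mathbf x)\times[\mathbf j_2(\mathbf x')\times(\mathbf x - \mathbf x')]}{|\mathbf x - \mathbf x'|^3} \\
&= \frac{\mu_0}{4\pi}\int d^3x\,d^3x'\;\frac{\mathbf j_2(\mathbf x')\,[\mathbf j_1(\mathbf x)\cdot(\mathbf x - \mathbf x')] - [\mathbf j_1(\mathbf x)\cdot\mathbf j_2(\mathbf x')]\,(\mathbf x - \mathbf x')}{|\mathbf x - \mathbf x'|^3},
\end{aligned} \qquad (4.116)$$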


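where we used eq. (1.9) to expand the triple vector product. We now
observe that

$$\int d^3x\,\frac{\mathbf j_1(\mathbf x)\cdot(\mathbf x - \mathbf x')}{|\mathbf x - \mathbf x'|^3} = -\int d^3x\;\mathbf j_1(\mathbf x)\cdot\nabla_{\mathbf x}\frac{1}{|\mathbf x - \mathbf x'|} = \int d^3x\;[\nabla_{\mathbf x}\cdot\mathbf j_1(\mathbf x)]\,\frac{1}{|\mathbf x - \mathbf x'|} = 0, \qquad (4.117)$$

where we used eq. (4.19), we integrated by parts discarding the boundary
term (which is valid if $\mathbf j_1(\mathbf x)$ is localized, or, more generally, if it decreases
faster than $1/|\mathbf x|$ as $|\mathbf x| \to \infty$), and we finally used current conservation
in the form (4.113). Therefore, eq. (4.116) simplifies to

$$\mathbf F_1 = -\frac{\mu_0}{4\pi}\int d^3x\,d^3x'\;\mathbf j_1(\mathbf x)\cdot\mathbf j_2(\mathbf x')\;\frac{\mathbf x - \mathbf x'}{|\mathbf x - \mathbf x'|^3}. \qquad (4.118)$$

Equation (4.118) is the analogue of eq. (4.26) for the magnetostatic case.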
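For two current loops $C_1$ and $C_2$, using eq. (4.103), we get

$$\mathbf F_1 = -\frac{\mu_0 I_1 I_2}{4\pi}\oint_{C_1}\oint_{C_2} d\boldsymbol\ell_1\cdot d\boldsymbol\ell_2\;\frac{\mathbf x(\ell_1) - \mathbf x(\ell_2)}{|\mathbf x(\ell_1) - \mathbf x(\ell_2)|^3}, \qquad (4.119)$$

where $\mathbf x(\ell_1)$ is the spatial coordinate of the points of the loop $C_1$ labeled
by the coordinate $\ell_1$ along the loop [defined as in eq. (4.97)], and $\mathbf x(\ell_2)$
is the spatial coordinate of the points of the loop $C_2$ labeled by the
coordinate $\ell_2$.

Similarly to the electric case [see the discussion following eq. (4.26)],
the force $\mathbf F_2$ exerted on the current density $\mathbf j_2$ by the current density $\mathbf j_1$
is obtained exchanging $1 \leftrightarrow 2$ in eq. (4.118),

$$\mathbf F_2 = -\frac{\mu_0}{4\pi}\int d^3x\,d^3x'\;\mathbf j_2(\mathbf x)\cdot\mathbf j_1(\mathbf x')\;\frac{\mathbf x - \mathbf x'}{|\mathbf x - \mathbf x'|^3}, \qquad (4.120)$$
and, renaming $\mathbf x \leftrightarrow \mathbf x'$, we see that $\mathbf F_2 = -\mathbf F_1$, so the magnetostatic
force satisfies Newton’s third law. We also observe that, if we set $\mathbf j_2(\mathbf x) =
\mathbf j_1(\mathbf x)$, the integral vanishes because it becomes odd under $\mathbf x \leftrightarrow \mathbf x'$. This
shows that a current distribution does not exert a force on itself.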
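The double loop integral (4.119) can be evaluated numerically by splitting each loop into short straight segments. The Python sketch below does this for two coaxial square loops carrying currents in the same sense, and checks both Newton’s third law and the attractive character of the force. The geometry, the function names, and the number of segments per side are illustrative assumptions.

```python
import math

MU0 = 1.25663706212e-6  # vacuum permeability in T m / A (SI value)


def segments(vertices, n_per_side):
    """Split a closed polygon into short segments; return (midpoint, dl) pairs."""
    segs = []
    m = len(vertices)
    for k in range(m):
        p, q = vertices[k], vertices[(k + 1) % m]
        for s in range(n_per_side):
            a = tuple(p[i] + (q[i] - p[i]) * s / n_per_side for i in range(3))
            b = tuple(p[i] + (q[i] - p[i]) * (s + 1) / n_per_side for i in range(3))
            mid = tuple(0.5 * (a[i] + b[i]) for i in range(3))
            dl = tuple(b[i] - a[i] for i in range(3))
            segs.append((mid, dl))
    return segs


def force_on_first(i1, loop1, i2, loop2, n_per_side=40):
    """Discretized eq. (4.119): force exerted on loop1 by loop2."""
    f = [0.0, 0.0, 0.0]
    pref = MU0 * i1 * i2 / (4.0 * math.pi)
    segs2 = segments(loop2, n_per_side)
    for x1, dl1 in segments(loop1, n_per_side):
        for x2, dl2 in segs2:
            r = tuple(x1[i] - x2[i] for i in range(3))
            r3 = sum(c * c for c in r) ** 1.5
            dot = sum(dl1[i] * dl2[i] for i in range(3))
            for i in range(3):
                f[i] -= pref * dot * r[i] / r3
    return tuple(f)


if __name__ == "__main__":
    lower = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    upper = [(x, y, 0.5) for (x, y, _) in lower]
    print(force_on_first(1.0, lower, 1.0, upper))   # force on the lower loop
    print(force_on_first(1.0, upper, 1.0, lower))   # force on the upper loop
```

With currents circulating in the same sense, the force on the lower loop points toward the upper one, and the two forces are equal and opposite, in agreement with the discussion above.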

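To make contact with the setting of Section 4.2.3, we now choose
$\mathbf j_1(\mathbf x') = I_1\,\delta^{(2)}(\mathbf x'_\perp)\,\hat{\mathbf z}$ and $\mathbf j_2(\mathbf x) = I_2\,\delta^{(2)}(\mathbf x_\perp - \mathbf d_\perp)\,\hat{\mathbf z}$ [compare this with
eq. (4.100)], where $\mathbf d_\perp$ is a vector of modulus $d$ in the transverse $(x, y)$
plane. This corresponds to two infinite straight wires with parallel currents,
separated by a distance $d = |\mathbf d_\perp|$ in the transverse plane. For
definiteness, we take $\mathbf d_\perp$ along the $x$ axis, so that $\mathbf d_\perp = (d, 0)$. Writing
$d^3x = d^2x_\perp\,dz$ and $d^3x' = d^2x'_\perp\,dz'$, eq. (4.120) gives

$$\frac{d\mathbf F_2}{dz} = -\frac{\mu_0 I_1 I_2}{4\pi}\int_{-\infty}^{\infty} dz'\;\frac{\mathbf x_2(z) - \mathbf x_1(z')}{|\mathbf x_2(z) - \mathbf x_1(z')|^3}, \qquad (4.121)$$

where, having performed the integration over $d^2x_\perp$ and $d^2x'_\perp$ with the
help of the Dirac deltas $\delta^{(2)}(\mathbf x'_\perp)$ and $\delta^{(2)}(\mathbf x_\perp - \mathbf d_\perp)$, we have $\mathbf x_2(z) =
(d, 0, z)$ and $\mathbf x_1(z') = (0, 0, z')$. Carrying out the integral,³⁴ we get

$$\frac{d\mathbf F_2}{dz} = -\frac{\mu_0}{2\pi}\,\frac{I_1 I_2}{d}\,\hat{\mathbf x}. \qquad (4.122)$$

Since we have chosen $\hat{\boldsymbol\rho} = \hat{\mathbf x}$, where, in general, $\hat{\boldsymbol\rho}$ is the unit vector in
the radial direction of the transverse plane, from the wire 1 sitting at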


34 Explicitly, we write $\mathbf x_2(z) - \mathbf x_1(z') = (d, 0, z - z')$, and therefore

$$\frac{d\mathbf F_2}{dz} = -\frac{\mu_0 I_1 I_2}{4\pi}\int_{-\infty}^{\infty} dz'\;\frac{(d, 0, z - z')}{[d^2 + (z - z')^2]^{3/2}}.$$

Passing to the integration variable $u = (z - z')/d$, we see that the third
component in this vector expression vanishes because the integrand is odd in $u$,
while, for the first component, we get

$$\int_{-\infty}^{\infty} dz'\;\frac{d}{[d^2 + (z - z')^2]^{3/2}} = \frac{1}{d}\int_{-\infty}^{\infty} du\;\frac{1}{[1 + u^2]^{3/2}} = \frac{2}{d}.$$

35 The difference between the behavior of the Coulomb force between two point
charges, which decreases as $1/r^2$, and the magnetic force between two infinite
straight parallel wires, which decreases with distance only as $1/d$, is
due to the different structure of the sources, which is a three-dimensional
Dirac delta for a point-like charge, but just a two-dimensional Dirac delta
in the transverse plane, extending infinitely in the longitudinal direction, for
an infinite straight wire. Note, however, that when the current densities
are localized, as for two well separated wire loops, the factor $\mathbf j_1(\mathbf x)\cdot\mathbf j_2(\mathbf x')$ in
eq. (4.118) gets both positive and negative contributions, depending on the
relative orientations of different portions of the wires, and therefore there
are partial cancellations. As a result, the magnetic force decreases faster than
the Coulomb force between two charge densities $\rho_1(\mathbf x)$ and $\rho_2(\mathbf x')$ with fixed
signs (and is very sensitive to the relative orientation of the loops). We will
come back to this when we study the expansion in electric and magnetic
multipoles, in Chapter 6.



x, = 0 toward the wire 2 (and dz = dé, since the wires are parallel to 
the z axis), we see that we have recovered eq. (4.112). 
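As a numerical cross-check of eqs. (4.121) and (4.122), the following minimal sketch evaluates the $z'$ integral with a simple midpoint rule and compares it with the closed form. The function names, the cutoff `half_length` (measured in units of $d$) and the number of steps are illustrative assumptions, not part of the text; `MU0` uses the approximate value $4\pi\times 10^{-7}$ in SI units.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability, SI units (approximate value, an assumption)


def force_per_length_numeric(i1, i2, d, half_length=1000.0, steps=100_000):
    """x component of dF_2/dz from eq. (4.121), midpoint rule in u = (z - z')/d."""
    du = 2.0 * half_length / steps
    integral = 0.0
    for k in range(steps):
        u = -half_length + (k + 0.5) * du
        # First component of the integrand after the change of variable;
        # the third component is odd in u and integrates to zero.
        integral += du / (1.0 + u * u) ** 1.5
    return -MU0 * i1 * i2 / (4.0 * math.pi * d) * integral


def force_per_length_exact(i1, i2, d):
    """Closed form of eq. (4.122); a negative value points from wire 2 toward wire 1."""
    return -MU0 * i1 * i2 / (2.0 * math.pi * d)


if __name__ == "__main__":
    # Two parallel 1 A currents separated by 1 cm (illustrative numbers).
    print(force_per_length_numeric(1.0, 1.0, 0.01))
    print(force_per_length_exact(1.0, 1.0, 0.01))
```

The finite cutoff leaves out tails that fall off as $1/u^3$, so the numerical value approaches the closed form as `half_length` grows.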

It is interesting to observe that the results for the electrostatic and 
the magnetostatic forces, eqs. (4.26) and (4.118), are completely anal- 
ogous, except for the overall sign, which is such that the electric force 
between charges of the same sign is repulsive, while the magnetic force 
between parallel currents is attractive. The similar structure of the in- 
tegrals comes from the fact that in both cases we had to solve a Laplace 
equation, for the scalar or the vector potential. The relative minus sign 
can be traced to the different nature (scalar or vector, respectively) of 
the scalar and vector potentials, resulting in different tensor structures 
of the indices involving the $\nabla$ operator; see in particular the minus sign
coming from the triple vector product in eq. (4.116).$^{35}$


4.2.5 Magnetic forces from surface integrals 


We now show that, similarly to the situation discussed in Section 4.1.7 
for electrostatics, the force exerted by a static magnetic field on a lo- 
calized current distribution can be written as a surface integral, on a 
surface enclosing the source. 

We start from eq. (4.110). We observe that, in that equation, B was an 
external magnetic field acting on a current j. Below eq. (4.120) we have 
found, however, that the force generated by a current distribution on 
itself vanishes. We can therefore extend eq. (4.110) to a generic current 
j, possibly made of several disjoint contributions, and we can take B as 
the total magnetic field generated by $\mathbf{j}$. We can then use Ampère’s law
(4.67), so that


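$$\mathbf{F} = \frac{1}{\mu_0}\int d^3x\, (\nabla\times\mathbf{B})\times\mathbf{B}\,. \tag{4.123}$$
Using the identity (1.7), we have
$$[(\nabla\times\mathbf{B})\times\mathbf{B}]_i = B_k\partial_k B_i - (\partial_i B_k)B_k\,. \tag{4.124}$$
Using $\nabla\cdot\mathbf{B} = 0$, we can rewrite $B_k\partial_k B_i = \partial_k(B_k B_i)$, so
$$[(\nabla\times\mathbf{B})\times\mathbf{B}]_i = \partial_k\left(B_i B_k - \frac{1}{2}\delta_{ik}B^2\right). \tag{4.125}$$
Therefore, if $V$ is any volume such that $\mathbf{j}$ vanishes outside it,
$$F_i = \frac{1}{\mu_0}\int_V d^3x\, \partial_k\left(B_i B_k - \frac{1}{2}\delta_{ik}B^2\right) = \frac{1}{\mu_0}\int_{\partial V} d^2s\, n_k\left(B_i B_k - \frac{1}{2}\delta_{ik}B^2\right), \tag{4.126}$$
or, in vector form,
$$\mathbf{F} = \frac{1}{\mu_0}\int_{\partial V} d^2s\left[(\mathbf{B}\cdot\hat{\mathbf{n}})\,\mathbf{B} - \frac{1}{2}B^2\,\hat{\mathbf{n}}\right], \tag{4.127}$$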


which is the magnetic analog of eq. (4.60). The force is then expressed 
in terms of the magnetic part of Maxwell’s stress tensor (3.64). 
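The identity (4.125) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the sample field $\mathbf{B} = (\sin y, \sin z, \sin x)$ is a hypothetical divergence-free field chosen for convenience (it is not taken from the text), and derivatives are approximated by central finite differences with an arbitrary step `h`.

```python
import math


def field(p):
    """A sample divergence-free field (illustrative choice, not from the text)."""
    x, y, z = p
    return (math.sin(y), math.sin(z), math.sin(x))


def partial(f, p, k, h=1e-5):
    """Central finite difference of the scalar function f along coordinate k."""
    plus, minus = list(p), list(p)
    plus[k] += h
    minus[k] -= h
    return (f(plus) - f(minus)) / (2.0 * h)


def curl_cross_b(p):
    """Left-hand side of eq. (4.125): components of (curl B) x B."""
    b = field(p)
    # grad[k][i] approximates d_k B_i
    grad = [[partial(lambda q, i=i: field(q)[i], p, k) for i in range(3)]
            for k in range(3)]
    curl = (grad[1][2] - grad[2][1],
            grad[2][0] - grad[0][2],
            grad[0][1] - grad[1][0])
    return (curl[1] * b[2] - curl[2] * b[1],
            curl[2] * b[0] - curl[0] * b[2],
            curl[0] * b[1] - curl[1] * b[0])


def stress_divergence(p):
    """Right-hand side of eq. (4.125): d_k (B_i B_k - delta_ik B^2 / 2)."""
    def tensor(q, i, k):
        b = field(q)
        b2 = sum(c * c for c in b)
        return b[i] * b[k] - (0.5 * b2 if i == k else 0.0)

    return tuple(
        sum(partial(lambda q, i=i, k=k: tensor(q, i, k), p, k) for k in range(3))
        for i in range(3)
    )


if __name__ == "__main__":
    point = (0.3, -1.1, 0.7)
    print(curl_cross_b(point))
    print(stress_divergence(point))
```

The two printed triples should agree up to finite-difference error; the agreement relies on $\nabla\cdot\mathbf{B} = 0$, so a field with non-zero divergence would break it.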


4.3 Electromagnetic induction 


We now take a closer look at Faraday’s law, in the integrated form
(3.21). This law governs the induction phenomena on which AC generators,
transformers, etc., are based, and therefore has a fundamental role
in electrical engineering and all the related technology. Here we will
limit ourselves to two basic examples that highlight the basic principles.


4.3.1 Time-varying magnetic field and Lenz’s law 


Equation (3.21) tells us that, if the flux of the magnetic field through 
a surface $S$, with boundary $C = \partial S$, changes in time, this induces a
circulation of the electric field along the closed curve $C$. Therefore, a
time-varying magnetic field generates an electric field, which is called the
“induced” electric field. We are interested, in particular, in the case in
which $C$ is the loop made by a conducting wire. As the simplest example,
we consider a wire in the $(x, y)$ plane, at rest in our reference frame, and
a magnetic field $\mathbf{B}_{\rm ext}$ in the $z$ direction, see Fig. 4.9, and we increase
the magnetic field with time. Then, there will be an induced electric 
field in the wire. This will generate a current in the wire (the “induced 
current”) and, in turn, a wire carrying a current generates a magnetic 
field (the “induced magnetic field”). Since we will eventually use the 
approximation of magnetostatics to compute the induced magnetic field, 
the following computation is only valid if the change in time of the 
external magnetic field is quasi-adiabatic. 

Lenz’s law states that the induced magnetic field has the direction 
that opposes the change of the flux of the external magnetic field. To 
see how this comes out from eq. (3.21), recall that the flux $\Phi_B$ is defined
by choosing an orientation for the normal of the surface $S$; for the surface
bounded by the wire in Fig. 4.9, let us choose its normal, for instance,
in the positive $z$ direction. From Stokes’s theorem, this choice then fixes
the direction of the line element $d\boldsymbol{\ell}$ in eq. (3.21), according to the
“right-hand rule”: if we close the right hand along the direction of integration
of the loop $C$, the thumb must point in the direction of the normal to
the surface. Therefore, choosing the $+z$ direction for the normal of the
surface implies that the line integral on the left-hand side of eq. (3.21)
runs counterclockwise. In our setting, we have chosen $\mathbf{B}_{\rm ext} = B_{\rm ext}(t)\,\hat{\mathbf{z}}$
with $B_{\rm ext}(t) > 0$ and $dB_{\rm ext}/dt > 0$, and we have chosen the normal $\hat{\mathbf{n}}$ to
$S$ equal to $+\hat{\mathbf{z}}$, so we have $\Phi_{B_{\rm ext}} > 0$ and $d\Phi_{B_{\rm ext}}/dt > 0$. Therefore, from
eq. (3.21), along the wire the electric field must point in the clockwise
direction, so that $d\boldsymbol{\ell}\cdot\mathbf{E} < 0$, in order to compensate for the minus sign
on the right-hand side.$^{36}$

This electric field generates in the wire a current $\mathbf{j}$ in the direction
of $\mathbf{E}$,$^{37}$ which therefore also runs clockwise, as shown in the figure. In
turn, this current generates a magnetic field (the “induced magnetic
field”) $\mathbf{B}_{\rm ind}$. From eq. (4.105), one can see that the induced magnetic
field circulates around the wire, in the direction shown in Fig. 4.9; at a
sufficiently small distance from the wire, where it can be approximated as
straight, this can also be seen more simply using the result for a straight
wire found in Section 4.2.1, see Fig. 4.7. So, inside the loop, $\mathbf{B}_{\rm ind}$ is
in the direction opposite to the external field. It therefore generates a
flux that opposes the increase of the flux of the
external magnetic field, which is the content of Lenz’s law.

Fig. 4.9 The induced current and induced magnetic field created by increasing
the flux of an external magnetic field through the closed loop $C$.

36 Had we chosen $-\hat{\mathbf{z}}$ as the direction of the normal to $S$, we would now have
$\Phi_B$ negative and increasing in absolute value, so $d\Phi_B/dt < 0$. Then, overall,
the right-hand side of eq. (3.21) would now be positive. On the other hand,
with this choice of the normal, the line integral in Stokes’s theorem would run
clockwise. Since we would now need $d\boldsymbol{\ell}\cdot\mathbf{E} > 0$ to match the sign on the
right-hand side, we would still conclude that $\mathbf{E}$ points in the clockwise direction.
Of course, the direction of the induced electric field does not depend
on our arbitrary choice of the direction of the normal to the surface $S$.

37 As we will discuss in Section 13.6.2, for a simple conducting wire the current
is given by Ohm’s law, $\mathbf{j} = \sigma\mathbf{E}$, where the positive constant $\sigma$ is the
conductivity of the material.

Fig. 4.10 A closed loop evolving in time, and two surfaces that have
$C(t)$ and $C(t + \delta t)$, respectively, as boundaries.

Note that Lenz’s law states that the induced magnetic field opposes 
the change of the flux, not the flux itself. For instance, if we repeat the 
same argument as before starting from a given magnetic field
$\mathbf{B}_{\rm ext} = B_{\rm ext}\hat{\mathbf{z}}$, with $B_{\rm ext} > 0$, and we decrease $B_{\rm ext}$ toward zero, we find that
the induced current will flow counterclockwise, so, in the direction that 
tries to restore the original value of the field inside the loop. 
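The sign bookkeeping behind Lenz’s law can be summarized in a few lines of code. This is a minimal sketch under simplifying assumptions: a flat loop of unit area in the $(x,y)$ plane, a uniform field along $z$, and a hypothetical helper name `induced_current_sense`. It shows that the sense of the induced current, as seen from $+z$, does not depend on the arbitrary choice of the normal.

```python
def induced_current_sense(db_dt, normal_sign=1):
    """Sense of the induced current, seen from +z, for a loop in the (x, y) plane.

    db_dt       -- time derivative of the z component of the external field
    normal_sign -- +1 if the surface normal is chosen along +z, -1 for -z
    """
    # Rate of change of the flux through a unit-area surface with the chosen normal.
    dflux_dt = normal_sign * db_dt
    # Faraday's law: circulation of E along the orientation fixed by the right-hand rule.
    circulation = -dflux_dt
    # The right-hand orientation is counterclockwise (seen from +z) for normal +z
    # and clockwise for normal -z, so we convert to a fixed viewpoint.
    counterclockwise_part = circulation * normal_sign
    if counterclockwise_part > 0:
        return "counterclockwise"
    if counterclockwise_part < 0:
        return "clockwise"
    return "none"


if __name__ == "__main__":
    print(induced_current_sense(+1.0, +1))  # increasing field, normal +z
    print(induced_current_sense(+1.0, -1))  # same physics, normal -z
    print(induced_current_sense(-1.0, +1))  # decreasing field
```

The first two calls describe the same physical situation with opposite conventions for the normal and give the same answer, while reversing the sign of `db_dt` reverses the current.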


4.3.2 Induction on moving loops 


We next consider the case in which a wire, which makes a closed loop, 
moves with respect to a magnetic field. It takes no extra effort to con- 
sider the most general case in which the magnetic field also changes 
with time, so we will include both effects. The situation is depicted in 
Fig. 4.10, where we show the position of a loop $C$, representing a closed
wire, at time $t$ and at time $t + \delta t$, and two corresponding surfaces, $S(t)$
and $S(t + \delta t)$, that have $C(t)$ and $C(t + \delta t)$, respectively, as boundaries.
We do not need to assume that the loop moves rigidly. The motion
of the loop is determined by giving, at each point of $C(t)$, the corresponding
velocity $\mathbf{v}(t)$. The position of that point at time $t + \delta t$ is then
obtained by adding to it the vector $\mathbf{v}(t)\delta t$, as shown in the figure, and
$\mathbf{v}(t)$ is allowed to change from point to point of the loop (we should then
write $\mathbf{v}[t,\mathbf{x}(\ell)]$ where $\ell$ is the curvilinear coordinate along the loop, see
eq. (4.97), but we will keep the notation simple).

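The difference between the magnetic flux going through $S(t + \delta t)$ and
that going through $S(t)$ is given by

$$\delta\Phi_B(t) = \int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t+\delta t,\mathbf{x}) - \int_{S(t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x})\,. \tag{4.128}$$

Working to first order in the infinitesimal quantity $\delta t$, we can manipulate
this as

$$\begin{aligned}
\delta\Phi_B(t) &= \int_{S(t+\delta t)} d\mathbf{s}\cdot\left[\mathbf{B}(t+\delta t,\mathbf{x}) - \mathbf{B}(t,\mathbf{x})\right] \\
&\quad + \int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) - \int_{S(t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) \\
&= \delta t\int_{S(t)} d\mathbf{s}\cdot\frac{\partial\mathbf{B}}{\partial t} + \int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) - \int_{S(t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x})\,,
\end{aligned} \tag{4.129}$$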


where we first added and subtracted the same quantity $\int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x})$
and then, in the first integral of the last line, to first order in $\delta t$ we could
replace $S(t+\delta t)$ by $S(t)$, since this term already has a factor $\delta t$ in front.
We now consider the closed volume $V$ bounded by the surfaces $S(t)$ and
$S(t+\delta t)$, and by the lateral cylindrical region $S_L$ swept by the loop when
it evolves from $C(t)$ to $C(t + \delta t)$. Observe that, on $S(t + \delta t)$, the outer
normal to $\partial V$ is the same as the normal $\hat{\mathbf{n}}(t + \delta t)$ of $S(t + \delta t)$, while, on
$S(t)$, the outer normal to $\partial V$ is the same as $-\hat{\mathbf{n}}(t)$, see Fig. 4.10. Using
the fact that $\nabla\cdot\mathbf{B} = 0$, together with Gauss’s theorem, we have


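$$0 = \int_V d^3x\, \nabla\cdot\mathbf{B} = \int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) - \int_{S(t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) + \int_{S_L} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x})\,, \tag{4.130}$$

where the minus sign in front of the second term is due to the fact
that the outer normal of $\partial V$ on $S(t)$ is minus the normal $\hat{\mathbf{n}}(t)$ shown
in Fig. 4.10. We now observe, again from Fig. 4.10, that the surface
element of the lateral surface $S_L$ is given by

$$d\mathbf{s} = d\boldsymbol{\ell}\times(\mathbf{v}\,\delta t)\,, \tag{4.131}$$

where $d\boldsymbol{\ell}$ is the line element on $C(t)$. Therefore, eq. (4.130) gives

$$\begin{aligned}
\int_{S(t+\delta t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) - \int_{S(t)} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) &= -\delta t\oint_{C(t)} (d\boldsymbol{\ell}\times\mathbf{v})\cdot\mathbf{B}(t,\mathbf{x}) \\
&= -\delta t\oint_{C(t)} d\boldsymbol{\ell}\cdot(\mathbf{v}\times\mathbf{B})\,,
\end{aligned} \tag{4.132}$$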


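where, more explicitly, $\mathbf{v} = \mathbf{v}[t,\mathbf{x}(\ell)]$ and $\mathbf{B} = \mathbf{B}[t,\mathbf{x}(\ell)]$. Plugging this
into eq. (4.129) and taking the limit $\delta t \to 0$, we get

$$\frac{d\Phi_B}{dt} = \int_{S(t)} d\mathbf{s}\cdot\frac{\partial\mathbf{B}}{\partial t} - \oint_{C(t)} d\boldsymbol{\ell}\cdot(\mathbf{v}\times\mathbf{B})\,. \tag{4.133}$$

Finally, in the first integral we express $\partial\mathbf{B}/\partial t$ in terms of $\nabla\times\mathbf{E}$ using
Faraday’s law (3.4) and we use Stokes’s theorem (1.38). This gives the
final expression

$$\frac{d\Phi_B}{dt} = -\oint_{C(t)} d\boldsymbol{\ell}\cdot(\mathbf{E} + \mathbf{v}\times\mathbf{B})\,. \tag{4.134}$$

Equation (4.134) shows that, for a moving loop, the electromotive force
$\mathcal{E}_{\rm emf}$, which, for a static loop, we have already introduced after eq. (3.21),
can be written as

$$\mathcal{E}_{\rm emf} = \oint_{C(t)} d\boldsymbol{\ell}\cdot(\mathbf{E} + \mathbf{v}\times\mathbf{B})\,, \tag{4.135}$$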


and is made of two terms: the first is the electric field induced by the 
time derivative of the magnetic field, and the second (the “motional 
emf”) is due to the motion of the loop in the magnetic field. Note that 
the latter gives an electromotive force even in a static magnetic field. 
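A simple setting where the motional term of eq. (4.135) can be checked against the flux rule (4.134) is a rectangular loop in the $(x,y)$ plane with one side sliding along $x$ in a static, uniform field $\mathbf{B} = B\hat{\mathbf{z}}$ (so that $\mathbf{E} = 0$). The sketch below is an illustration only; the geometry, the function names and the numerical parameters are assumptions made for the example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def emf_from_line_integral(b_z, speed, width, segments=1000):
    """Motional emf: discretized loop integral of dl . (v x B).

    Only the sliding side (at the larger x, of length `width` along y) moves,
    so it is the only side that contributes. With the normal along +z the loop
    is traversed counterclockwise, i.e. along +y on that side.
    """
    b = (0.0, 0.0, b_z)
    v = (speed, 0.0, 0.0)
    dl = (0.0, width / segments, 0.0)
    return sum(dot(dl, cross(v, b)) for _ in range(segments))


def emf_from_flux_rule(b_z, speed, width, x0=1.0, dt=1e-6):
    """Minus the time derivative of the flux, by a central finite difference."""
    def flux(t):
        return b_z * width * (x0 + speed * t)

    return -(flux(dt) - flux(-dt)) / (2.0 * dt)


if __name__ == "__main__":
    print(emf_from_line_integral(0.5, 2.0, 0.1))
    print(emf_from_flux_rule(0.5, 2.0, 0.1))
```

Both functions should return the same value, $-Bvw$ for field $B$, speed $v$ and width $w$, which is the statement that the motional emf accounts for the full $-d\Phi_B/dt$ when the field is static.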

Observe that, on the right-hand side of eq. (4.134), we have obtained
the combination of electric and magnetic fields, $\mathbf{E} + \mathbf{v}\times\mathbf{B}$, that enters
in the Lorentz force (3.5).$^{38}$

38 This, eventually, is dictated by the underlying Lorentz covariance of
Maxwell’s equations. As we will see in Section 8.6.1, the combination $\mathbf{E} + \mathbf{v}\times\mathbf{B}$
is the spatial component of a four-vector.

Fig. 4.11 The electric field generated by a charged plane, and the
cylinder through which we compute the flux of the electric field. The arrows
denote the lines of the electric field.


39As we have already seen in Sec- 
tion 4.2.1, there is a subtle limitation to 
this use of symmetry arguments, which 
is related to the phenomenon of spon- 
taneous symmetry breaking and is es- 
pecially important in particle physics 
and in condensed matter. In general, 
the symmetries of the problem (i.e., the 
symmetries of the equations that gov- 
ern the problem, including its geome- 
try and boundary conditions) are the 
same as the symmetries of the solution 
of these equations only when the solu- 
tion is unique. More generally, one can 
have a family of solutions, that are not 
invariant under the symmetry transfor- 
mation of the problem (in our example 
of an infinite plane, rotations around 
the z axis), but are transformed into 
each other by such transformations. A 
classic example is given by a ferromag- 
net: the fundamental equations govern- 
ing the interaction between elementary 
dipole moments in a ferromagnet are 
invariant under the full group of rota- 
tions in three dimensions. However, in 
its magnetized phase, in a ferromagnet 
the dipole moments are aligned along 
one specific direction, randomly chosen, 
so this solution is not invariant under 
rotations. The symmetry under rota- 
tions is now reflected in the fact that 
the ferromagnet can align itself along 
any direction, so there is a family of so- 
lutions, related to each other by rota- 
tions. However, the solution to electro- 
static problems is unique, as we have 
shown in Section 4.1.5, so in our case 
this use of symmetry principles is cor- 
rect. 


4.4 Solved problems 


In this section we collect, in the form of Solved Problems, a number of 
other rather classic applications of electrostatics and magnetostatics. 


Problem 4.1. Electric field of an infinite charged plane 


The integrated form of Maxwell’s equations, presented in Section 3.1.2, is 
particularly useful when the geometry of the problem has a high degree of 
symmetry. As an example, consider the electric field generated by a static 
charge density, distributed uniformly on a plane, idealized to be infinite in 
extent and with zero thickness. Let $\sigma$ be the charge per unit surface on the
plane. We orient the axes so that the charged plane coincides with the $(x,y)$
plane, as in Fig. 4.11. For symmetry reasons, the electric field must then
point in the $\pm\hat{\mathbf{z}}$ directions and its modulus cannot depend on $x$, $y$.$^{39}$ Taking
for definiteness a charge density $\sigma > 0$, so that the lines of electric field go out
from the plane, we have

$$\mathbf{E} = \pm E(z)\,\hat{\mathbf{z}}\,, \tag{4.136}$$

where the plus sign holds for $z > 0$ and the minus sign for $z < 0$, and $E(z) =
|\mathbf{E}(z)|$. To compute $E(z)$, we consider a volume $V$ bounded by the cylinder
shown in Fig. 4.11, with end-faces $\mathbf{S}_1 = S\hat{\mathbf{z}}$ and $\mathbf{S}_2 = S \times (-\hat{\mathbf{z}})$, both of area
$S$, located at $\pm z$, respectively, and we use Gauss’s law in the integrated form
(3.12). The charge inside the cylinder is $Q = \sigma S$. The fluxes through $\mathbf{S}_1$ and
$\mathbf{S}_2$ are equal and add up, since, at $\mathbf{S}_1$, both $\mathbf{E}$ and the outer normal to the
surface are in the $+z$ direction, while at $\mathbf{S}_2$ they are both in the $-z$ direction, so
$\mathbf{E}(z)\cdot\mathbf{S}_1 = \mathbf{E}(-z)\cdot\mathbf{S}_2 = E(z)S$, while there is no flux through the lateral surface
of the cylinder. Then, eq. (3.12) gives

$$2SE(z) = \frac{\sigma S}{\epsilon_0}\,, \tag{4.137}$$

so, in the end, the modulus of the electric field is independent even of $z$, and

$$\mathbf{E} = \pm\frac{\sigma}{2\epsilon_0}\,\hat{\mathbf{z}}\,. \tag{4.138}$$


It is instructive to repeat the computation by performing explicitly the inte- 
gration over the electric fields generated by the infinitesimal surface elements 
of the plane. We choose the coordinates so that the charged plane corresponds 
to $z = 0$, and we compute the electric field at a given point $P$ with $z > 0$. We
set the origin of the reference frame so that the coordinates of this point are
$(0,0,z)$ and, in the $(x,y)$ plane, we use polar coordinates $(\rho,\varphi)$. We compute
first the electric field generated in $P$ by the charges in an infinitesimal ring,
lying in the charged plane, and with radial coordinate between $\rho$ and $\rho + d\rho$,
while $0 \leq \varphi < 2\pi$. Integrating over $\varphi$, the total contribution to the components
of $\mathbf{E}$ parallel to the plane, $E_x$ and $E_y$, vanishes, since the contribution
from an infinitesimal surface $\rho\, d\rho\, d\varphi$ at a given value of $\varphi$ is canceled by the
contribution from the infinitesimal surface on the other side of the ring, at
$\varphi + \pi$, so $E_x = E_y = 0$. This explicitly confirms the result obtained previously
from symmetry arguments. Writing $E_z = |\mathbf{E}|\cos\theta$, all the points in the ring,
i.e., with the same value of $\rho$, contribute in the same way to the modulus, as
well as to $\cos\theta$,

$$|\mathbf{E}| = \frac{1}{4\pi\epsilon_0}\,\frac{\sigma\rho\, d\rho\, d\varphi}{\rho^2 + z^2}\,, \qquad \cos\theta = \frac{z}{(\rho^2 + z^2)^{1/2}}\,. \tag{4.139}$$

Therefore,
$$E_z = \frac{\sigma}{4\pi\epsilon_0}\, 2\pi z\int_0^\infty d\rho\, \frac{\rho}{(\rho^2 + z^2)^{3/2}}\,. \tag{4.140}$$
Passing to the integration variable $u = \rho/z$, we see that $z$ cancels and
$$E_z = \frac{\sigma}{2\epsilon_0}\int_0^\infty du\, \frac{u}{(1 + u^2)^{3/2}}\,. \tag{4.141}$$

The integral is elementary and equal to 1, so we recover eq. (4.138). 
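The ring decomposition of eq. (4.140) also shows how the infinite-plane idealization emerges from a finite source. The sketch below sums rings up to a finite radius $R$ and compares the result with the standard closed form for the on-axis field of a uniformly charged disk, $E_z = (\sigma/2\epsilon_0)\,[1 - z/\sqrt{R^2 + z^2}]$, which follows from the same integral with upper limit $R$. The function names, the value of `EPS0` and the discretization are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (approximate value, an assumption)


def disk_field_on_axis(sigma, z, radius, rings=100_000):
    """E_z at height z above the center of a charged disk, summing rings as in eq. (4.140)."""
    d_rho = radius / rings
    integral = 0.0
    for k in range(rings):
        rho = (k + 0.5) * d_rho  # midpoint of the k-th ring
        integral += rho * d_rho / (rho * rho + z * z) ** 1.5
    return sigma / (4.0 * math.pi * EPS0) * 2.0 * math.pi * z * integral


def disk_field_closed_form(sigma, z, radius):
    """Closed form of the same integral with a finite upper limit."""
    return sigma / (2.0 * EPS0) * (1.0 - z / math.sqrt(radius * radius + z * z))


def plane_field(sigma):
    """Modulus of the field of the infinite plane, eq. (4.138)."""
    return sigma / (2.0 * EPS0)


if __name__ == "__main__":
    sigma, z = 1.0e-6, 0.01
    for radius in (0.1, 1.0, 10.0):
        ratio = disk_field_on_axis(sigma, z, radius) / plane_field(sigma)
        print(radius, ratio)
```

As $R/z$ grows the printed ratio approaches 1, with a correction of order $z/R$, which quantifies when a finite plate can be treated as infinite.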

We can now determine the corresponding gauge potentials $\phi$ and $\mathbf{A}$. In a
problem of electrostatics the magnetic field vanishes, and we can set $\mathbf{A} = 0$.
Then, eq. (3.83) gives $\mathbf{E} = -\nabla\phi$. In this case, again by the symmetry of the
problem, $\phi$ cannot depend on $(x,y)$, so $\phi = \phi(z)$ and $\nabla\phi = (d\phi/dz)\hat{\mathbf{z}}$. Then,
from eq. (4.138),

$$\frac{d\phi}{dz} = \mp\frac{\sigma}{2\epsilon_0}\,, \tag{4.142}$$
so
$$\phi(z) = \phi_0 - \frac{\sigma}{2\epsilon_0}|z|\,, \tag{4.143}$$

where $\phi_0$ is an arbitrary integration constant. The use of the integrated form
of Maxwell’s equations (in this case, just the integrated form of Gauss’s law),
for a problem with a high degree of symmetry, has allowed us to very quickly
solve directly in terms of the electric field. It is instructive to compare this
with the direct integration of Poisson’s equation (4.3). In our idealized setting
of a charged plane with zero thickness, we have $\rho(\mathbf{x}) = \sigma\delta(z)$. Since $\rho(\mathbf{x})$ only
depends on $z$, we search for a solution $\phi = \phi(z)$ and eq. (4.3) becomes

$$\frac{d^2\phi}{dz^2} = -\frac{\sigma}{\epsilon_0}\,\delta(z)\,. \tag{4.144}$$


From eq. (1.70) we see that the most general solution for $d\phi/dz$ is of the
form
$$\frac{d\phi}{dz} = -\frac{\sigma}{\epsilon_0}\left[\theta(z) + a\right], \tag{4.145}$$
with $a$ a (dimensionless) integration constant. Quite commonly, in problems
of electrostatics, integration constants are fixed by requiring that the electric
field vanishes at infinite distance from the sources. Here, however, this is
not possible, since we have considered an infinite plane and, with such an
idealization, there is no guarantee that the electric field will vanish at $z \to \pm\infty$.
In fact, we already know from the solution (4.138) that this will not be the
case. Rather, the symmetry of the problem requires that the modulus $|d\phi/dz|$
must be invariant under the parity transformation $z \to -z$ (we have in fact
already used this condition when solving the problem using the integrated
Gauss’s law). The function $\theta(z)$ is equal to 0 for $z < 0$ and to 1 for $z > 0$, so
its absolute value is not an even function, but this is easily fixed by choosing
the integration constant $a = -1/2$, since $\theta(z) - 1/2$ is equal to $-1/2$ for $z < 0$
and $+1/2$ for $z > 0$, so $|\theta(z) - 1/2|$ is an even function. Then, eq. (4.145)
becomes

$$\frac{d\phi}{dz} = -\frac{\sigma}{\epsilon_0}\left[\theta(z) - \frac{1}{2}\right], \tag{4.146}$$
which agrees with eq. (4.142). Observe also that, using the identity

$$\theta(z) + \theta(-z) = 1\,, \tag{4.147}$$
we can also rewrite eq. (4.146) as

$$\frac{d\phi}{dz} = -\frac{\sigma}{2\epsilon_0}\left[\theta(z) - \theta(-z)\right]. \tag{4.148}$$




Fig. 4.12 The geometry for com- 
puting the contribution of the elec- 
tric field at the point P, due to the 
charges on two antipodal infinites- 
imal regions on the surface of the 
sphere. 


Problem 4.2. Electric field of a spherical charge distribution 


Consider now the electric field generated by a spherical charge distribution 
of radius $d$. In this case, by symmetry, the electric field will be in the radial
direction and its modulus will only depend on $r$, so $\mathbf{E} = E(r)\hat{\mathbf{r}}$. We take a
spherical surface $S$ of radius $r > d$ and use eq. (3.12). Writing $d\mathbf{s} = r^2 d\Omega\,\hat{\mathbf{r}}$
we get

$$\Phi_E(t) = \int_S r^2 d\Omega\, E(r) = 4\pi r^2 E(r)\,, \tag{4.149}$$
and therefore, from eq. (3.12),

$$E(r) = \frac{1}{4\pi\epsilon_0}\frac{Q}{r^2}\,, \qquad (r > d)\,, \tag{4.150}$$
where $Q$ is the total charge of the sphere. This is the famous result (which is
due to Newton, who found it for the gravitational case, but is valid for any
force proportional to $1/r^2$) that the field outside a uniformly charged sphere
is the same as if all the charge (or, in the gravitational case, all the mass) of
the sphere were concentrated at its center.

If, instead, we take $r < d$, $\Phi_E$ is still given by eq. (4.149), but, on the right-hand
side of eq. (3.12) the only contribution comes from the charge enclosed within the radius $r$.
In particular, if all the charge is on the surface of the sphere, the electric field
at any point inside the sphere is zero! To understand how this comes out
from the cancellation among the contributions of different charges, consider the
setting of Fig. 4.12 in which, for graphical clarity, we only show a section of
the sphere. We want to compute the electric field at a generic point $P$ inside
the sphere by summing over the contributions from the charges on the surface
of the sphere. Consider first the charges in the infinitesimal region $dS_1$ which,
with respect to the point $P$, subtends a solid angle $d\Omega$ and is at a distance
$r_1$. Then, $dS_1 = r_1^2 d\Omega$ and, if the surface charge on the sphere is $\sigma$ (which is
a constant, independent of the polar angles $\theta, \phi$, because of the assumption
of spherical symmetry), the charge in $dS_1$ is $\sigma r_1^2 d\Omega$. Taking $\sigma > 0$, we see
from the figure that the electric field produced in $P$ from the charge in $dS_1$ is
directed radially and inward, toward the center $O$ of the sphere. Therefore, it
generates an electric field

$$d\mathbf{E}_1 = \frac{1}{4\pi\epsilon_0}\,\frac{\sigma r_1^2 d\Omega}{r_1^2}\,(-\hat{\mathbf{r}}) = -\frac{1}{4\pi\epsilon_0}\,\sigma\, d\Omega\,\hat{\mathbf{r}}\,. \tag{4.151}$$

The crucial point is that $r_1^2$ canceled between numerator and denominator.
Consider now the contribution of the antipodal surface $dS_2$, subtending the
same solid angle $d\Omega$. Now, as we see from the figure, the contribution to the
electric field is in the $+\hat{\mathbf{r}}$ direction. We denote by $r_2$ the distance of $dS_2$ from
$P$. Then

$$d\mathbf{E}_2 = \frac{1}{4\pi\epsilon_0}\,\frac{\sigma r_2^2 d\Omega}{r_2^2}\,(+\hat{\mathbf{r}}) = +\frac{1}{4\pi\epsilon_0}\,\sigma\, d\Omega\,\hat{\mathbf{r}}\,. \tag{4.152}$$

This is equal and opposite to the contribution from $dS_1$. Therefore, when
integrating over the whole sphere, the contribution of each surface element is 
canceled by that of its antipodal surface elements, and we recover the result 
that E = 0 inside the sphere. 
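The cancellation inside the shell, and the point-charge behavior outside it, can be verified by direct summation over the surface charges. The following sketch is an illustration under stated assumptions: the observation point is placed on the $z$ axis (so that only $E_z$ survives by symmetry), the shell is sliced into rings of constant polar angle, and the names and numerical values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (approximate value, an assumption)


def shell_field_on_axis(sigma, radius, p, steps=20_000):
    """E_z at the point (0, 0, p) from a uniformly charged spherical shell.

    The shell is decomposed into rings at polar angle theta; each ring carries
    the charge sigma * 2 pi R^2 sin(theta) d(theta), and all its points are at
    the same distance from the observation point.
    """
    d_theta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * d_theta
        dq = sigma * 2.0 * math.pi * radius ** 2 * math.sin(theta) * d_theta
        dz = p - radius * math.cos(theta)  # z separation between ring and point
        dist2 = radius ** 2 + p ** 2 - 2.0 * radius * p * math.cos(theta)
        total += dq * dz / dist2 ** 1.5
    return total / (4.0 * math.pi * EPS0)


def point_charge_field(q, r):
    """Coulomb field of a point charge, as in eq. (4.150)."""
    return q / (4.0 * math.pi * EPS0 * r * r)


if __name__ == "__main__":
    sigma, radius = 1.0e-9, 1.0
    q_total = 4.0 * math.pi * radius ** 2 * sigma
    print(shell_field_on_axis(sigma, radius, 0.5))  # inside the shell
    print(shell_field_on_axis(sigma, radius, 2.0), point_charge_field(q_total, 2.0))
```

Inside the shell the sum is compatible with zero within the discretization error, while outside it reproduces the field of a point charge $Q$ at the center.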


Problem 4.3. Parallel-plate capacitor 


We next consider a parallel-plate capacitor, made of two parallel and op- 
positely charged infinite planes, separated by a distance d along the z axis, 
with surface densities $\sigma > 0$ at $z = 0$ and $-\sigma$ at $z = d$, as shown in Fig. 4.13.
The electric field can be computed using the superposition principle, that we 
discussed in Section 4.1.1. The electric field of the parallel-plate capacitor is 
then immediately obtained from the result of Problem 4.1, just by summing 
the electric fields produced by the two plates. Taking into account that the 
modulus of the electric field of an infinite plane is independent of z and its 
direction changes sign on the two sides, and that we have taken the opposite 
signs for the surface charge densities on the two planes, we see that, for z < 0 
and for z > d, the fields produced by the two planes cancel out, while inside 
the capacitor they add up, so 


$$\mathbf{E} = \frac{\sigma}{\epsilon_0}\,\hat{\mathbf{z}}\,, \qquad (0 < z < d)\,. \tag{4.153}$$
The potential $\phi$ is therefore constant for $z < 0$ and $z > d$, while, inside, it
satisfies

$$\frac{d\phi}{dz} = -\frac{\sigma}{\epsilon_0}\,, \tag{4.154}$$
so
$$\phi(z) = \phi_0 - \frac{\sigma}{\epsilon_0}\,z\,. \tag{4.155}$$

The absolute value of the difference in the potential between the charged plates
is therefore

$$V = |\phi(d) - \phi(0)| = \frac{\sigma d}{\epsilon_0}\,. \tag{4.156}$$

If, rather than taking an infinite extent in the $(x, y)$ plane, we take the charged
planes to have a finite area $A$, the total charge $Q$ on each plate is, in absolute
value, $Q = \sigma A$. We take $A$ sufficiently large so that the effects near the finite
boundary can be neglected, and the previous computation of the electric field
still goes through, except near the boundaries. For a generic capacitor the
capacitance $C$ is defined by

$$C = \frac{Q}{V}\,, \tag{4.157}$$
where $Q$ is the charge of the positively charged plate, so $Q > 0$. Since $V$ is
the absolute value of the potential difference between the two conductors, $C$ is
positive by definition. Then, from eq. (4.156) we find that, for a parallel-plate
capacitor,

$$C = \frac{\epsilon_0 A}{d}\,. \tag{4.158}$$
From eq. (4.157), we see that capacitances are naturally measured in coulombs
per volt, C/V, and this derived SI unit is called the farad (F), in honor of
Michael Faraday. In terms of the base units, $1\,{\rm F} = 1\,{\rm C}^2{\rm s}^2/({\rm kg\,m}^2)$.$^{40}$ The
farad, however, turns out to be an unreasonably large unit, in practice. Typical
values for capacitors are of order picofarad (${\rm pF} = 10^{-12}\,{\rm F}$) to microfarad
($\mu{\rm F} = 10^{-6}\,{\rm F}$). Note also that the combination ${\rm C}^2/({\rm N\,m}^2)$, that gives the
units of $\epsilon_0$, is the same as farads/meter; we see this, for instance, from the
fact that ${\rm C}\times{\rm V} = {\rm N}\times{\rm m}$, since, dimensionally, they are both energies, and
therefore

$$\frac{{\rm C}^2}{{\rm N\,m}^2} = \frac{{\rm C}}{{\rm V}}\,\frac{{\rm N\,m}}{{\rm N\,m}^2} = \frac{{\rm F}}{{\rm m}}\,. \tag{4.159}$$

We could also have seen this directly from eq. (4.158). Therefore, eq. (2.13)
can be rewritten as

$$\epsilon_0 \simeq 8.854\ldots\times 10^{-12}\,\frac{{\rm F}}{{\rm m}}\,. \tag{4.160}$$

Fig. 4.13 The electric field generated by a parallel-plate capacitor.

40 As mentioned in Note 7 on page 30, we sometimes use the coulomb instead
of the ampere as a basic unit.


From this we see why the picofarad is a more appropriate unit than the farad.
For a parallel-plate capacitor filled with vacuum, as in the example that we
have considered, taking for instance $A \sim 1\,{\rm cm}^2$ and $d \sim 1\,{\rm mm}$, from eq. (4.158)
we get $C$ of the order of a pF.
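The order-of-magnitude estimate above is a one-line evaluation of eq. (4.158). In the short sketch below, the helper name `parallel_plate_capacitance` and the rounded value of `EPS0` are assumptions made for illustration.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (approximate value, an assumption)


def parallel_plate_capacitance(area, distance):
    """Capacitance of an ideal vacuum-filled parallel-plate capacitor, eq. (4.158)."""
    return EPS0 * area / distance


if __name__ == "__main__":
    # A = 1 cm^2 and d = 1 mm, expressed in SI units.
    c = parallel_plate_capacitance(1.0e-4, 1.0e-3)
    print(f"C = {c / 1.0e-12:.3f} pF")
```

With these inputs the result is a little below one picofarad, consistent with the estimate in the text.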

The force exerted by one plate on the other is attractive, given that the two
plates are oppositely charged. Its modulus can be obtained from eq. (4.138),
which gives the electric field generated by one plate, multiplying it by the
absolute value $Q$ of the charge on the other plate, so

$$F = \frac{\sigma Q}{2\epsilon_0}\,. \tag{4.161}$$

We can re-express it in terms of the total electric field $E$ between the two
plates, given in eq. (4.153), which is twice as large as the field generated by
each plate, so

$$F = \frac{1}{2}\,QE\,. \tag{4.162}$$


Typical circuits contain a large number of capacitors. The linearity of Maxwell’s
equations implies that the total charges $Q_a$ on the capacitors (where $a =
1,\ldots, N$ labels the capacitor) and the potentials $\phi_a$ on their surfaces are
related linearly,

$$Q_a = \sum_{b=1}^{N} C_{ab}\,\phi_b\,. \tag{4.163}$$
The coefficients $C_{ab}$ depend only on the geometry of the system, i.e., the shape
of the conductors and their relative arrangement. They are called the coefficients
of capacitance, and form the capacitance matrix $C$. The off-diagonal
elements $C_{ab}$, with $a \neq b$, are called the mutual (or cross) capacitances, while
the diagonal elements $C_{aa}$ are the (self) capacitances. Their values are not
the same as the value of the capacitance of the $a$-th conductor in the absence
of all others. Indeed, even when all $\phi_b$ with $b \neq a$ are set to zero (i.e., the
conductors are “grounded”), there are charges on their surfaces that influence
the potential on the surface of the $a$-th conductor. In Problem 5.2, we will
prove some useful properties obeyed by the coefficients $C_{ab}$.
The inverse matrix $C^{-1}$ is often denoted by $P$, so

$$\phi_a = \sum_{b=1}^{N} P_{ab}\,Q_b\,, \tag{4.164}$$

and $P_{ab}$ are called the coefficients of potential.
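To make the relation between eqs. (4.163) and (4.164) concrete, the sketch below inverts a $2\times 2$ capacitance matrix and checks that the potentials are recovered from the charges. The numerical entries of the matrix are purely hypothetical values chosen for the example (they do not describe any specific pair of conductors), and the helper names are assumptions.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Matrix-vector product, i.e. the sums over b in eqs. (4.163) and (4.164)."""
    return [sum(m_ab * v_b for m_ab, v_b in zip(row, v)) for row in m]


if __name__ == "__main__":
    # Hypothetical coefficients of capacitance, in farads.
    cap = [[3.0e-12, -1.0e-12],
           [-1.0e-12, 2.0e-12]]
    pot = invert_2x2(cap)           # coefficients of potential P = C^{-1}
    phi = [5.0, 0.0]                # conductor 2 grounded
    charges = mat_vec(cap, phi)     # eq. (4.163)
    print(charges)
    print(mat_vec(pot, charges))    # eq. (4.164), should reproduce phi
```

Even with the second conductor grounded, its charge in this example is non-zero, which illustrates why the diagonal coefficients differ from the capacitance of an isolated conductor.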


Problem 4.4. Spherical capacitor 


We now consider a spherical capacitor, made of two concentric spherical 
shells of radius a and b, with a > b, and charges —Q and Q, respectively. 
Again, by symmetry, the electric field is radial, $\mathbf{E} = E(r)\hat{\mathbf{r}}$. Applying the
integrated form of Gauss’s law, eq. (3.12), to a spherical volume with a radius
$r < b$ we find that the flux is zero because there is no charge inside the
volume, and therefore $E = 0$ at $r < b$. Similarly, at $r > a$, the charges of the
two spherical shells compensate each other and the total charge is $Q - Q = 0$,
so again eq. (3.12) gives $E = 0$. The field between the two shells can be
computed using eq. (3.12) where the volume $V$ used in the integration is now
a sphere of radius $r$, with $b < r < a$. The total charge inside this volume is
just the charge $Q$ of the inner spherical shell, while the external spherical shell
has no effect so, as in eq. (4.150),

$$E(r) = \frac{1}{4\pi\epsilon_0}\frac{Q}{r^2}\,, \qquad (b < r < a)\,. \tag{4.165}$$
The potential difference between the two shells is therefore, in absolute value,

$$V = \frac{Q}{4\pi\epsilon_0}\left(\frac{1}{b} - \frac{1}{a}\right) = \frac{Q}{4\pi\epsilon_0}\,\frac{a-b}{ab}\,, \tag{4.166}$$

and the capacitance is

$$C = 4\pi\epsilon_0\,\frac{ab}{a-b}\,. \tag{4.167}$$


Comparing with the result (4.158), we see that we still have in the denominator
the distance $d = a - b$ between the two elements of the capacitor, while the area
factor $A$ in eq. (4.158) is replaced by $4\pi ab$ in eq. (4.167). In the limit $b \to a$,
$4\pi ab$ tends to the area $A$ of the sphere, consistently with the fact that, if the
radii of curvature of the spheres are much larger than their distance $d = a - b$,
locally, the geometry is the same as that of a parallel-plate capacitor.
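The approach to the parallel-plate limit can be made quantitative by comparing the standard spherical-capacitor result, $C = 4\pi\epsilon_0\, ab/(a-b)$, with eq. (4.158) evaluated for $A = 4\pi a^2$ and $d = a - b$. The sketch below is illustrative; the function names and the sample radii are assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (approximate value, an assumption)


def spherical_capacitance(a, b):
    """Capacitance of two concentric shells of radii a > b."""
    return 4.0 * math.pi * EPS0 * a * b / (a - b)


def parallel_plate_capacitance(area, distance):
    """Capacitance of an ideal parallel-plate capacitor, eq. (4.158)."""
    return EPS0 * area / distance


def ratio_to_plane(a, b):
    """Spherical capacitance divided by the plane estimate with A = 4 pi a^2, d = a - b."""
    plane = parallel_plate_capacitance(4.0 * math.pi * a * a, a - b)
    return spherical_capacitance(a, b) / plane


if __name__ == "__main__":
    for b in (0.5, 0.9, 0.99, 0.999):
        print(b, ratio_to_plane(1.0, b))
```

The ratio equals $b/a$, so it tends to 1 linearly in $(a-b)/a$ as the gap closes.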


Problem 4.5. Electrostatic energy of an ionic crystal 


We now discuss the electrostatic energy of an ionic crystal, such as NaCl,
where the positively charged Na$^+$ ions and the negatively charged Cl$^-$ ions
are arranged in a cubic lattice, with lattice spacing $a$ (so that each ion has six
nearest neighbors with opposite charge, at a distance $a$). To compute it, we
can select one given ion, say of Na$^+$, and compute its interaction energy with
all other Na$^+$ and Cl$^-$ ions. This gives the interaction energy per ion,
$u$. Since the computation performed choosing any of the $N/2$ positive ions or
any of the $N/2$ negative ions gives the same result, the total potential
energy is then obtained multiplying this result by $N$, and then dividing by 2,
since in this way each pair has been counted twice, so $U = uN/2$.

We set the origin of the reference frame on the chosen ion. The positions
of the ions, in a regular cubic lattice of spacing $a$, are given by $\mathbf{x} = a\mathbf{n} =
a(n_x, n_y, n_z)$ with $n_x, n_y, n_z$ integers running from $-\infty$ to $+\infty$, so their
distance from the origin is $a(n_x^2 + n_y^2 + n_z^2)^{1/2}$. The charge of the ion at the
position $a\mathbf{n}$ is $(-1)^{n_x+n_y+n_z}e$.$^{41}$

41 This can be checked setting at first $n_y = n_z = 0$ and moving along the $x$
axis. At $n_x = 0$ we have our chosen Na$^+$ ion, with charge $+e$; for $n_x$ even
we find again positively charged ions, and for $n_x$ odd negatively charged ions.
In the line defined by, say, $n_y = 1, n_z = 0$, the situation is inverted. Just on top
of our chosen ion, at $n_x = 0$, we find a negative charge, at $n_x = 1$ a positive
charge, and so on.

42 Actually, there are further mathematical subtleties here. If one uses a
series of expanding cubes the sum converges while, if one uses a series of
expanding spheres, it can be proven that the sum diverges. In fact, the problem
is quite interesting and challenging from a mathematical point of view.
The rigorous definition of the sum in eq. (4.168) is obtained from the analytic
continuation to $s = 1$ of the series
$$\sum_{\mathbf{n}\neq 0} \frac{(-1)^{n_x+n_y+n_z}}{(n_x^2 + n_y^2 + n_z^2)^{s/2}}\,,$$
which is absolutely convergent for ${\rm Re}(s) > 1$, and the sum over expanding
cubes converges to the correct result; see the discussion in Section 3 of Bailey,
Borwein, Kapoor, and Weisstein (2006),
https://www.davidhbailey.com//dhbpapers/tenproblems.pdf.

Therefore, the potential energy per ion of an ionic crystal is given by

$$u = \frac{e^2}{4\pi\epsilon_0\, a}\sum_{\mathbf{n}\neq 0} \frac{(-1)^{n_x+n_y+n_z}}{(n_x^2 + n_y^2 + n_z^2)^{1/2}}\,. \tag{4.168}$$


The sum in eq. (4.168) is called the Madelung sum. Here, however, we
encounter a problem. This series is not absolutely convergent, i.e., its
convergence (and its finite value in the case that it converges) depends on the
order in which we sum the terms. For instance, if we first summed all
the terms with $n_x + n_y + n_z$ even, we would get a divergent result, since a
sum of terms that all have the same sign, and that behaves asymptotically as
$\sum_n 1/n$, diverges. This mathematical problem reflects a physical ambiguity. The sum
can be organized moving outward from our chosen ion, and including in the
sum the interaction with the ions which belong to larger and larger volumes,
with our chosen ion near their center. One possibility is to choose these
volumes such that they all contain a zero net charge. Another option is to choose
them so that, near their boundaries, they are deformed so as to include only
ions of a certain type, e.g., positively charged. Choosing different sequences
of volumes, that include or exclude ions at given positions on the boundary,
we can obtain any desired distribution of surface charges on this sequence of
volumes. Such a distribution of surface charges can, in general, have a
finite or even a divergent interaction energy with the chosen ion, even when
the boundary is at a very large distance $r$ from it, since the contribution to
the sum from the ions at a large distance $r$ decreases as $1/r$, but, taking for
instance the extreme case of a constant surface charge (i.e., the situation in
which the boundary is deformed so as to include only ions of a given sign),
the total charge on the surface grows as $r^2$; in this case, the interaction of
the surface charges with the chosen ion diverges as $r \to \infty$. With different
organizations of the sequence of volumes used in the sum, corresponding to
different surface charge distributions, we can obtain different results, finite or
divergent; this is the physical reason why the result depends on how the sum
in eq. (4.168) is organized. For computing the electrostatic energy of a crystal,
we are interested in the situation in which there is no surface charge; therefore,
the sum must be organized through a sequence of electrically neutral
volumes. Using a series of expanding and electrically neutral cubes, one can
show that the sum converges, and can be computed numerically.$^{42}$ The result
can be written as


$$u = -\frac{A}{4\pi\epsilon_0}\,\frac{e^2}{a}, \tag{4.169}$$

where $A \simeq 1.7476$ is the Madelung constant for a cubic lattice, such as that
of NaCl. Equivalently, using $U = uN/2$, the total potential energy of a crystal
with $N$ ions is

$$U = -N\,\frac{A}{8\pi\epsilon_0}\,\frac{e^2}{a}. \tag{4.170}$$

We can compare the value of $A$ obtained from this procedure with that
obtained starting from a given Na ion and computing its interaction energy with
the 6 nearest neighbor Cl ions at distance $a$, the 12 Na ions at distance $a\sqrt{2}$
and the 8 Cl ions at distance $a\sqrt{3}$ (note that this configuration is not
electrically neutral). This gives

$$u \simeq \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{a}\left(-6 + \frac{12}{\sqrt{2}} - \frac{8}{\sqrt{3}} + \ldots\right), \tag{4.171}$$

and therefore in this approximation $A = 6 - (12/\sqrt{2}) + (8/\sqrt{3}) + \ldots \simeq 2.13$,
to be compared with the correct value $A \simeq 1.7476$.
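
The comparison above can be reproduced numerically. The following is a minimal sketch, not the procedure used in the text: it assumes one particular way of building electrically neutral cubes, namely counting ions on the faces, edges, and corners of each cube with weights 1/2, 1/4, and 1/8 (the function names `madelung_neutral_cube` and `madelung_three_shells` are hypothetical).

```python
import math

def madelung_neutral_cube(L):
    """Estimate A from a cube with |n_x|, |n_y|, |n_z| <= L (lengths in units of a).

    Assumption of this sketch: ions on the boundary of the cube are counted
    with fractional weights (1/2 per coordinate lying on the boundary), which
    makes every cube, including the central ion, electrically neutral.
    """
    total = 0.0
    for nx in range(-L, L + 1):
        for ny in range(-L, L + 1):
            for nz in range(-L, L + 1):
                if nx == 0 and ny == 0 and nz == 0:
                    continue  # skip the chosen ion itself
                weight = 1.0
                for n in (nx, ny, nz):
                    if abs(n) == L:
                        weight *= 0.5
                sign = -1.0 if (nx + ny + nz) % 2 else 1.0
                total += weight * sign / math.sqrt(nx * nx + ny * ny + nz * nz)
    return -total  # since u = -A e^2 / (4 pi eps0 a)

def madelung_three_shells():
    """Unweighted (non-neutral) estimate from the first three shells, eq. (4.171)."""
    return 6 - 12 / math.sqrt(2) + 8 / math.sqrt(3)

if __name__ == "__main__":
    for L in (1, 2, 3, 6):
        print(L, madelung_neutral_cube(L))
    print("three shells:", madelung_three_shells())
```

With this weighting the estimates approach $A \simeq 1.7476$ already for small cubes, whereas the unweighted three-shell estimate stays near 2.13.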


Two aspects of eq. (4.170) are noteworthy. First, the sum gives a negative
result for $U$. This means that the attraction between opposite charges wins
over the repulsion between charges of the same sign, resulting in a bound
state. This is a welcome result, since it goes in the direction of explaining the
cohesion of a crystal. However, we can now ask what fixes the value of the
lattice spacing $a$. From eq. (4.169), the smaller $a$ is, the more negative the
potential energy is, and the state with the largest binding energy is obtained for
$a \to 0^+$. Therefore, if eq. (4.169) were the end of the story, the crystal could
lower its energy by decreasing $a$ indefinitely and would therefore collapse. This
result, however, is not unexpected: we have already found in Section 4.1.4
that no set of electric charges can be in electrostatic equilibrium. Once again,
at atomic scales quantum mechanics must come to the rescue to ensure the
stability of matter.$^{43}$

Indeed, quantum effects induce an effective repulsion between the core
electrons of the Na and Cl ions, that counterbalances the tendency of the
electrostatic potential to induce the collapse of the lattice. Without entering into
details beyond the scope of this course, all we need to know here is that,
phenomenologically, this quantum repulsion can be described by a potential per
unit ion of the form $B/r^n$, for some positive constant $B$ and a power index
$n > 1$ (with $n \simeq 8$ for NaCl).$^{44}$ The equilibrium value of the lattice spacing is
then determined by minimizing, with respect to $r$, the total potential energy per
unit ion,


$$u(r) = -\frac{A}{4\pi\epsilon_0}\,\frac{e^2}{r} + \frac{B}{r^n}. \tag{4.172}$$

Requiring $du/dr = 0$ fixes the equilibrium value of $r$, which we denote by $a$,
from the relation

$$B = \frac{A}{4\pi\epsilon_0}\,\frac{e^2 a^{n-1}}{n}, \tag{4.173}$$

so the lattice spacing $a$ is fixed in terms of $A$, $B$, and $n$. If we use this relation
to eliminate $B$ from eq. (4.172), we find that the potential per unit ion is

$$u = -\frac{A}{4\pi\epsilon_0}\,\frac{e^2}{a}\left(1 - \frac{1}{n}\right). \tag{4.174}$$

The term $1/n$ therefore gives a quantum correction to the binding energy.
Using the numerical values of $e$ and $1/(4\pi\epsilon_0)$ from eqs. (2.4) and (2.11), together
with the NaCl values $A \simeq 1.7476$, $n \simeq 8$ and $a \simeq 0.281$ nm, gives a dissociation
energy per unit molecule (i.e., the energy per unit pair of NaCl that must be
provided to dissociate the crystal) $u_{\rm diss} = -u \simeq 1.255 \times 10^{-18}$ J. The energy
needed to dissociate a mole of NaCl is then

$$U_{\rm diss} = u_{\rm diss} N_A \simeq 756\ {\rm kJ/mol}, \tag{4.175}$$

where $N_A$ is the Avogadro number (2.3). Using the conversion factor 1 kJ $\simeq$
0.239 kcal, we can also rewrite it as $U_{\rm diss} \simeq 181$ kcal/mol. Alternatively, a
convenient unit for atomic physics is the electronvolt (eV), defined by the
exact relation$^{45}$

$$1\ {\rm eV} = 1.602176634 \times 10^{-19}\ {\rm J}. \tag{4.176}$$
Then, the dissociation energy per unit pair of NaCl can be written as

$$u_{\rm diss} \simeq 7.84\ {\rm eV}. \tag{4.177}$$

$^{43}$ Actually, the result that, for this problem, no stable result is possible in
classical electrodynamics, can already be derived just with dimensional
analysis. The electrostatic energy per ion pair must be proportional to the
product of their charges and therefore, in SI units, to $e^2/(4\pi\epsilon_0)$. The only way
to obtain an energy from this quantity is to divide it by a length, and in
this problem we have only one length-scale available, the lattice spacing $a$.
Therefore, the result must necessarily be of the form (4.169), for some
numerical constant $A$, positive or negative. If $A > 0$, as is actually the case, the
energy is minimized for $a \to 0^+$, and the lattice collapses. If one had found
$A < 0$, $u$ would have been given by a positive constant times $1/a$, and this
is minimized for $a \to \infty$, so in this case repulsion wins and the lattice
"explodes," with its ions dispersing at infinite distance from each other.
Quantum mechanics is able to solve the problem because it has at its disposal
another fundamental constant, the Planck constant $h$, and with it, together with
the mass $m$ of a particle (or, here, of an ion), we can form the combination
$h/(mc)$, which has dimensions of length. Therefore, we have another
length-scale at our disposal, which allows for more complicated functional
forms of the potential, such as that in eq. (4.172) (note, in fact, that $B$ there
is not a pure number, contrary to $A$).

$^{44}$ For the reader with some elementary knowledge of quantum mechanics, the
mechanism that stabilizes the system is actually the Pauli exclusion principle:
when two ions get too close, the core electrons of an ion begin to feel
the presence of the core electrons of the other ion and the Pauli principle
(more precisely, the antisymmetrization of the wave functions of these
electrons) provides an effective repulsion between them.

$^{45}$ Note the relation with the absolute value of the electron charge, $e$, given in
eq. (2.4): the eV is the energy that a charge $e$ acquires by going through a
potential difference of 1 V.

Fig. 4.14 A solenoid, made by a wire winding along a cylinder, and the
contour $C$ used for the integrated Ampère law.

Fig. 4.15 As in Fig. 4.14, for a wire that winds very tightly around the
cylinder, so that $\mathbf{j} = j\hat{\boldsymbol{\varphi}}$.

Problem 4.6. Magnetic field of a straight solenoid 


We next compute the magnetic field of an infinite straight solenoid, i.e., of
a wire carrying a current, that winds around an infinitely long cylinder, as in
Fig. 4.14. We proceed similarly to the computation for an infinite straight wire
in Section 4.2.1. We use again cylindrical coordinates $(\rho, \varphi, z)$. We assume
that the wire winds very tightly around the cylinder, so that it practically
defines a surface current density $\mathbf{j} = j\hat{\boldsymbol{\varphi}}$, see Fig. 4.15. Then, the problem is
invariant under translations along the $z$ axis and rotations around the cylinder,
and therefore $\mathbf{B}(\rho, \varphi, z)$ must again be of the form (4.72), as was the case for
an infinitely long straight wire.

The radial component $B_\rho$ vanishes, by essentially the same argument used
in Section 4.2.1: we denote the radius of the solenoid by $a$ and we consider a
cylinder of radius $R > a$, such as that in Fig. 4.5, which encloses the solenoid
(rather than enclosing a wire, as was the case in Fig. 4.5). The flux through
the surface of the cylinder must vanish, by eq. (4.71). Since $B_z$ is the same
at the faces of the cylinder at $z = \pm h/2$ (because of the invariance of the
problem under translations along $z$), the flux from the lower face cancels that
from the upper face, and then the flux from the lateral surface of the cylinder
must also vanish, and this implies $B_\rho = 0$. This remains true if we put this
cylindrical volume inside the solenoid, i.e., if we take $R < a$. Therefore, $B_\rho$
vanishes both inside and outside the solenoid.

Similarly, taking a loop $C$ in the plane transverse to the cylinder, such as
that in Fig. 4.4 (where, again, now the loop encloses the solenoid rather than
a wire) and of radius $\rho > a$, the left-hand side of eq. (4.70) is $2\pi\rho B_\varphi$, while
the right-hand side vanishes, because for the (tightly winding) solenoid the
current is purely in the $\hat{\boldsymbol{\varphi}}$ direction, and has no $\hat{\mathbf{z}}$ component, so there is
no current flowing through a surface bounded by $C$; the same holds for $\rho < a$,
and therefore $B_\varphi = 0$. This means that, in this problem,

$$\mathbf{B}(\rho, \varphi, z) = B_z(\rho)\,\hat{\mathbf{z}}. \tag{4.178}$$


This could have also been shown using symmetry arguments, similarly to what
we did in Section 4.2.1 for an infinite wire: we begin by considering a parity
transformation $\mathbf{x} \to -\mathbf{x}$, denoted by $\Pi$. Under it, a point on the surface of
the cylinder, say at $\mathbf{x} = (0, a, z)$ (where the axes are oriented as in Fig. 4.6),
is sent into the antipodal point $\mathbf{x}' = (0, -a, -z)$, which is still on the surface
of the cylinder. Under this transformation, $\mathbf{j}$ transforms as $\mathbf{j}(\mathbf{x}) \to -\mathbf{j}(\mathbf{x}')$.
So, if at the point $\mathbf{x}$ the current was flowing inward with respect to the plane
of the page in Fig. 4.15, after the transformation the current at $\mathbf{x}'$ is flowing
outward from the plane of the page. This is indeed how the current at the
antipodal point actually flows in Fig. 4.15, so the configuration in Fig. 4.15
is invariant under parity. We next study how $\mathbf{B}(\mathbf{x})$ transforms, for a generic
starting point $P$ with coordinates $\mathbf{x}$. After this parity transformation, the
situation is exactly the same as for the transformation from the point $P$ to
$P'$ in Fig. 4.6. Similarly to what we did in the discussion of Fig. 4.6, we
then perform further symmetry transformations that bring back the point $P'$
onto $P$. In this case, a convenient choice is to perform first a rotation by 180°
around the $z$ axis, that we denote by $R_z$, that brings $P'$ to a point $P''$ with
coordinates $\mathbf{x}'' = (0, a, -z)$, and finally a translation $T_z$ along the $z$ axis that
brings it back to $P$. Under $R_z$, $B_z$ is invariant while $B_x$ and $B_y$ change sign,
while under the translation all components are invariant. So, in the end, the
combined transformation $T_z R_z \Pi$ is a symmetry transformation of the system
and, under it, $B_x(\mathbf{x}) \to -B_x(\mathbf{x})$, $B_y(\mathbf{x}) \to -B_y(\mathbf{x})$, and $B_z(\mathbf{x}) \to +B_z(\mathbf{x})$.
Therefore, by the same chain of arguments discussed in Section 4.2.1 (including
a choice of boundary conditions that respect this symmetry), both $B_x$ and $B_y$
(and therefore $B_\rho$ and $B_\varphi$) must vanish and, in this setting, only $B_z$ is
non-zero, confirming the result that we found by a direct use of the integrated
Maxwell's equations.$^{46}$

So, either from a direct use of Maxwell's equations, or from symmetry
arguments, the magnetic field of an infinite straight solenoid has the form
(4.178). We then plug it into Ampère's law (4.67). Using the curl in cylindrical
coordinates, eq. (1.33), we get

$$\partial_\rho B_z = -\mu_0 j. \tag{4.179}$$

Outside the solenoid, i.e., for $\rho > a$, we have no current, so $j = 0$ and $\partial_\rho B_z = 0$.
The boundary condition $B_z = 0$ at $\rho \to \infty$ then fixes $B_z = 0$. Inside the
solenoid, at $\rho < a$, again $j = 0$ and $B_z$ is constant. However, we cannot
appeal to continuity across the solenoid to fix this constant to zero, because
of the singular current density that is present there. Rather, we take a loop $C$
such as that in Fig. 4.15, and we apply the integrated Ampère law (4.70),

$$\oint_C d\boldsymbol{\ell}\cdot\mathbf{B}(\mathbf{x}) = \mu_0 I_S, \tag{4.180}$$

where we denote by $I_S$ the total current flowing through the surface $S$ bounded
by $C$. If we denote by $I$ the current carried by the wire, by $n$ the number of
loops of the wire per unit length, and by $L$ the vertical length of the loop $C$,
we have $I_S = nLI$. The portions of $C$ in the radial direction do not contribute
to the integral in eq. (4.180) because there $d\boldsymbol{\ell}\cdot\mathbf{B} = d\rho\,\hat{\boldsymbol{\rho}}\cdot\mathbf{B} = d\rho\,B_\rho$, and we
have seen that $B_\rho = 0$ everywhere. Similarly, the portion of the loop in the
$\hat{\mathbf{z}}$ direction outside the solenoid does not contribute because, there, $B_z = 0$.
The only contribution therefore comes from the part of $C$ along the $z$ direction
and inside the solenoid, and there

$$\int d\boldsymbol{\ell}\cdot\mathbf{B} = \int_{-L/2}^{L/2} dz\, B_z(\rho) = L\,B_z(\rho). \tag{4.181}$$

Equation (4.180) therefore gives $L B_z(\rho) = \mu_0 n L I$, so $B_z$ is actually also
independent of $\rho$ and given by $B_z = \mu_0 n I$. In conclusion, in an infinite cylindrical
solenoid the magnetic field is non-vanishing only inside and, there, it is
oriented along the axis of the solenoid and is spatially uniform,

$$\mathbf{B} = \mu_0 n I\,\hat{\mathbf{z}}. \tag{4.182}$$
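
The result (4.182) can be cross-checked numerically by treating a long but finite solenoid as a stack of circular loops and summing their fields at the center. This is only a sketch under stated assumptions: it uses the standard on-axis field of a single circular loop, $\mu_0 I a^2/[2(a^2+z^2)^{3/2}]$, which is not derived in this problem, and the function names and numerical values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T m / A (pre-2019 defined value)

def loop_field_on_axis(current, radius, z):
    """On-axis field of one circular loop at axial distance z (assumed formula)."""
    return MU0 * current * radius**2 / (2 * (radius**2 + z**2) ** 1.5)

def stacked_loops_field(current, radius, turns_per_length, half_length):
    """Field at the center of a finite solenoid, summed loop by loop."""
    spacing = 1.0 / turns_per_length
    n_loops = int(round(half_length * turns_per_length))
    total = 0.0
    for k in range(-n_loops, n_loops + 1):
        total += loop_field_on_axis(current, radius, k * spacing)
    return total

def ideal_solenoid_field(current, turns_per_length):
    """Eq. (4.182): B = mu0 n I for the infinite solenoid."""
    return MU0 * turns_per_length * current

if __name__ == "__main__":
    I, a, n, H = 2.0, 0.01, 10_000, 1.0  # A, m, turns/m, m (illustrative values)
    print(stacked_loops_field(I, a, n, H) / ideal_solenoid_field(I, n))
```

For a solenoid whose length is much larger than its radius, the ratio printed at the end is close to 1, as expected from the infinite-solenoid limit.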


$^{46}$ We could reach the same conclusion by a combination of time reversal and
rotations. The transformation of $\mathbf{j}$ and $\mathbf{B}$ under time reversal was given in
eqs. (3.97) and (3.98). For time-independent fields, this reduces to
$\mathbf{B}(\mathbf{x}) \to -\mathbf{B}(\mathbf{x})$ and $\mathbf{j}(\mathbf{x}) \to -\mathbf{j}(\mathbf{x})$ and corresponds to the fact that eqs. (4.67)
and (4.68) are invariant under a transformation that flips simultaneously the
signs of $\mathbf{B}$ and $\mathbf{j}$, without touching the spatial coordinates. We can now
observe that, under a rotation by 180° around the $x$ axis (denoted by $R_x$),
the solenoid in Fig. 4.15 is geometrically unchanged, except that now
$\mathbf{j}$ runs clockwise rather than counterclockwise. This, however, can be
compensated by a time reversal transformation $T$. Therefore, $T R_x$ is a symmetry
of the system and, proceeding similarly to what we did previously, one finds
that, under this transformation (followed by a translation along the $z$ axis
to get back to the original point $P$ with coordinate $\mathbf{x}$), $B_x(\mathbf{x}) \to -B_x(\mathbf{x})$, while
$B_y(\mathbf{x}) \to B_y(\mathbf{x})$ and $B_z(\mathbf{x}) \to B_z(\mathbf{x})$. Then, with the by now usual chain
of arguments, $B_x = 0$. In the same way, using the transformation $T R_y$, one
finds $B_y = 0$. So, again, we find that only $B_z$ is non-zero.


Problem 4.7. Energy dissipation in a conducting wire


We now consider a resistive wire, i.e., a wire where the relation between the
current density and an applied electric field is given by Ohm's law,$^{47}$

$$\mathbf{j} = \sigma\mathbf{E}, \tag{4.183}$$

where $\sigma$ (not to be confused with a surface charge density) is a proportionality
constant called the conductivity. We will discuss Ohm's law and its microscopic
justification (as well as its frequency-dependent generalization) in Section 14.4;
for the moment, we just take it as a simple phenomenological relation which,
observationally, turns out to have a very broad range of applicability. Note
that, for a steady current, in the continuity equation (3.22) we have $\partial\rho/\partial t = 0$,
and therefore $\nabla\cdot\mathbf{j} = 0$. From eq. (4.183), it then follows that also

$$\nabla\cdot\mathbf{E} = 0, \tag{4.184}$$

so, even if, in this case, $\mathbf{E} \neq 0$ inside a conductor, still its divergence is zero.$^{48}$
From the discussion in eqs. (4.96)-(4.99) we know that $\nabla\cdot\mathbf{j} = 0$ implies that
$\mathbf{j}$ is uniform along the wire and therefore the same holds for $\mathbf{E}$.

$^{47}$ In Section 4.1.6 we proved that, inside a conductor, $\mathbf{E} = 0$. This, however,
was shown for static charges at equilibrium, i.e., when a conductor, isolated
and subject only to a static external electric field, has rearranged its
surface charges and reached an equilibrium situation, where the external field
is screened and $\mathbf{j} = 0$. Here, we are interested precisely in the opposite
situation where, using for instance a battery, a potential difference is kept
between two points of a conducting wire, driving a steady current inside it.

$^{48}$ This, of course, can be seen also as a consequence of Gauss's law, given
that the overall charge density inside the wire is zero. In the wire, the ions
are fixed while the electrons drift with the current. However, consider a
microscopic volume $dV = A\,dl$, where $A$ is the transverse size of the wire and
$dl$ a line element along the wire (here $dV$ is small compared to macroscopic
scales, but still sufficiently large to allow us to perform an average over many
ions and electrons, as necessary to define "macroscopic" quantities such as $\rho$
and $\mathbf{j}$, smoothing out fluctuations on atomic scale; we will discuss these
averaging procedures in detail in Chapter 13). The positive charge of the ions
present in the volume $dV$ is always compensated by the negative charges of the
electrons that, at the given time, happen to be in $dV$. The individual
electrons will continuously flow out of $dV$ from one side but, in a steady current,
they will be replaced by an equal flux of electrons that enter $dV$ from the
opposite side.

$^{49}$ Instead of the conductivity $\sigma$, it is common to use the resistivity $\rho = 1/\sigma$
(again, not to be confused with a charge density!). Then, for a thin wire,

$$R = \rho\,\frac{d}{A}. \tag{4.187}$$

We see that, dimensionally, $\rho$ is the same as a resistance times a length and
is normally given in the derived SI units of $\Omega\,$m. For instance, for copper at 20 °C,
$\rho \simeq 1.68 \times 10^{-8}\ \Omega\,$m. Conductivities are then given in $(\Omega\,{\rm m})^{-1}$. In the SI
system, the inverse of the ohm is called the siemens (S), so $1\ \Omega^{-1} = 1$ S. In
the SI system, conductivities are then measured in siemens per meter, S/m.
In terms of the fundamental SI units,

$$1\ {\rm S} = 1\ \frac{{\rm A}^2\,{\rm s}^3}{{\rm kg}\,{\rm m}^2}. \tag{4.188}$$

If we apply a potential difference $V_{ab}$ between two points $a$ and $b$ of the
wire at a distance $L$ apart, this will induce a current $I$, and the resistance
$R_{ab}$ between these two points is defined by $V_{ab} = R_{ab} I$, or (leaving henceforth
implicit the labels $a, b$)

$$V = RI, \tag{4.185}$$

which is the most elementary version of Ohm's law. The resistance $R$ is
determined by the conductivity $\sigma$, the geometry of the wire, and the distance
between the points $a, b$. For example, taking for simplicity a wire of
cross-sectional area $A$, with $\mathbf{j}$ uniform inside it, we have $I = jA$. On the other hand,
we have seen that the electric field is uniform along the wire so, if it produces
a potential difference $V$ between two points $a, b$ at distance $d$, its modulus is
$E = V/d$. Then, multiplying eq. (4.183) by $A$, we get $I = \sigma A E = (\sigma A/d)V$
which shows that, in this geometry,

$$R = \frac{d}{\sigma A}. \tag{4.186}$$

From eq. (4.185), $R$ has dimensions of volts per ampere, V/A. This derived
SI unit is called the ohm ($\Omega$).$^{49}$

As we have mentioned before, in the absence of an external agent such as
a battery, the current in the conductor would quickly settle to the equilibrium
value $\mathbf{j} = 0$. The mechanism that damps any initial current is given by the
collisions of the electrons, accelerated by the external field, against the fixed
ions, so that the kinetic energy of the electrons is dissipated into heat ("Joule
heating"). We will study this mechanism with a simple model in Section 14.4.
In any case, to keep a steady current going, we need to supply energy
continuously to the system, for instance with a battery. The corresponding work
per unit time can be computed observing that, if we take two points $a$ and $b$
on the wire, with potential difference $V$, the work done to transfer a charge
$dq$ from $a$ to $b$ is $dW = V dq$. In a steady current $I$, a charge $dq = I dt$ is
transferred in a time $dt$, so the required work is $dW = VI dt$. Then, using
$V = RI$ (where, again, $V = V_{ab}$ and $R = R_{ab}$ are, respectively, the potential
difference and the resistance between the points $a$ and $b$ but, according to
standard use, we suppress the labels $a, b$), we get

$$\frac{dW}{dt} = VI = RI^2 = \frac{V^2}{R}. \tag{4.189}$$

This is the energy per unit time that must be supplied to keep a steady current
going, to compensate the energy dissipated into heat in the wire.

It is instructive to compare this result with the flow of electromagnetic
energy into the wire, obtained from the Poynting vector. Observe that, outside
the wire, $\mathbf{E} = 0$ and the Poynting vector vanishes. Inside, however, both $\mathbf{E}$
and $\mathbf{B}$ are non-vanishing. In particular, the flow of energy into the wire can
be computed from the Poynting vector at the wire surface. We take the wire
as circular, of radius $a$. Using cylindrical coordinates with the $z$ axis along
the wire, the magnetic field is given by eq. (4.79),


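$$\mathbf{B}(\rho = a) = \frac{\mu_0 I}{2\pi a}\,\hat{\boldsymbol{\varphi}}, \tag{4.190}$$

where $I = j\pi a^2$. The electric field at the wire surface is obtained from
eq. (4.183) with $\mathbf{j} = j\hat{\mathbf{z}}$ and $j = I/(\pi a^2)$, so

$$\mathbf{E}(\rho = a) = \frac{I}{\sigma\pi a^2}\,\hat{\mathbf{z}}. \tag{4.191}$$

Therefore, using eq. (3.34) together with $\hat{\mathbf{z}}\times\hat{\boldsymbol{\varphi}} = -\hat{\boldsymbol{\rho}}$, we get$^{50}$

$$\mathbf{S} = -\frac{I^2}{2\pi^2\sigma a^3}\,\hat{\boldsymbol{\rho}}. \tag{4.194}$$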


Note that the energy flux points toward the wire, so the electromagnetic field
at the exterior of the wire feeds energy into the wire. The energy per unit time
that flows into a portion of wire of length $L$ is obtained from the right-hand
side of Poynting's theorem (3.35), taking as volume $V$ a cylinder of length $L$
along the $z$ axis and radius $a$. Its lateral surface element is $d\mathbf{s} = dz\,a\,d\varphi\,\hat{\boldsymbol{\rho}}$, so

$$-\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S} = \frac{I^2}{2\pi^2\sigma a^3}\int_0^L dz\int_0^{2\pi} a\,d\varphi = \frac{I^2 L}{\sigma A}, \tag{4.195}$$

where $A = \pi a^2$. From eq. (4.186), the resistance between two points at
distance $L$ is $R = L/(\sigma A)$, so we see that

$$-\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S} = RI^2. \tag{4.196}$$

Therefore, the flow of energy from the electromagnetic field into the wire
balances the losses due to dissipation in the wire, providing the continuous
inflow of energy necessary to maintain a steady current. Note that the flow of
energy is in the radial direction $-\hat{\boldsymbol{\rho}}$, even if the flow of the current
is in the $\hat{\mathbf{z}}$ direction.

As discussed in Note 11 on page 47, one can find different expressions for
the energy density and Poynting vector, that give the same integrated energy
conservation law. In particular for a static problem, where $\mathbf{E} = -\nabla\phi$ and
$\nabla\times\mathbf{B} = \mu_0\mathbf{j}$, we can rewrite the Poynting vector as$^{51}$

$$\mathbf{S} = \phi\,\mathbf{j}. \tag{4.197}$$


So, in this case the energy that flows into a portion of wire of length $L$,
obtained as before taking as volume $V$ a cylinder of length $L$ along the $z$ axis
and radius $a$, now appears to enter from the lower face of the cylinder and
flow out from the upper face (with a net difference between incoming and
outgoing flow due to the fact that the potential $\phi$ in eq. (4.197) grows linearly,
in absolute value, along the wire), rather than from the lateral faces of the
cylinder, as in eq. (4.194).

In any case, the important notion is that the energy is transferred to the
wire by the surrounding electromagnetic field, and energy conservation, as
derived from Maxwell's equations, is fundamentally an integrated relation.

$^{50}$ This can be rewritten as

$$\mathbf{S} = -\frac{1}{2\sigma}\,j^2(\rho = a)\,a\,\hat{\boldsymbol{\rho}}. \tag{4.192}$$

Observe that, if $\mathbf{j}$ is uniform across the wire, the same argument, taking as
volume $V$ a cylinder of length $L$ along the $z$ axis and radius $\rho < a$, shows that,
inside the wire,

$$\mathbf{S} = -\frac{1}{2\sigma}\,j^2\,\boldsymbol{\rho}, \tag{4.193}$$

where $\boldsymbol{\rho} = \rho\hat{\boldsymbol{\rho}}$.

$^{51}$ Explicitly,

$$\mathbf{S} = \frac{1}{\mu_0}\mathbf{E}\times\mathbf{B} = -\frac{1}{\mu_0}(\nabla\phi)\times\mathbf{B} = -\frac{1}{\mu_0}\left[\nabla\times(\phi\mathbf{B}) - \phi\nabla\times\mathbf{B}\right] = -\frac{1}{\mu_0}\left[\nabla\times(\phi\mathbf{B}) - \mu_0\phi\,\mathbf{j}\right] = \phi\,\mathbf{j} - \frac{1}{\mu_0}\nabla\times(\phi\mathbf{B}).$$

The term $\nabla\times(\phi\mathbf{B})$ does not contribute to the integrated conservation
equation, since its integral over a closed boundary $\partial V$ vanishes. It corresponds
to the freedom of adding to $\mathbf{S}$ a term $\nabla\times\mathbf{w}$, for an arbitrary vector field $\mathbf{w}$
(see Note 11 on page 47). We can therefore drop it, and use eq. (4.197) for the
Poynting vector.
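
The energy balance of Problem 4.7 can be illustrated with a short numerical sketch. It evaluates the resistance (4.186), the dissipated power (4.189), and the inflow of electromagnetic energy through the lateral surface, eqs. (4.190)-(4.196), for a copper wire. The function names and the chosen length, radius, and current are hypothetical; only the copper resistivity is taken from the text.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T m / A (pre-2019 defined value)

def wire_resistance(sigma, length, radius):
    """Eq. (4.186): R = L / (sigma A), with A = pi a^2."""
    return length / (sigma * math.pi * radius**2)

def joule_power(resistance, current):
    """Eq. (4.189): dW/dt = R I^2."""
    return resistance * current**2

def poynting_inflow(sigma, length, radius, current):
    """Energy per unit time entering through the lateral surface, eq. (4.195)."""
    e_surface = current / (sigma * math.pi * radius**2)   # eq. (4.191)
    b_surface = MU0 * current / (2 * math.pi * radius)     # eq. (4.190)
    s_inward = e_surface * b_surface / MU0                 # |S|, pointing along -rho
    return s_inward * 2 * math.pi * radius * length

if __name__ == "__main__":
    sigma_cu = 1 / 1.68e-8          # S/m, from the copper resistivity at 20 C
    L, a, I = 1.0, 1e-3, 10.0       # m, m, A (illustrative values)
    R = wire_resistance(sigma_cu, L, a)
    print("R [ohm]           :", R)
    print("Joule power [W]   :", joule_power(R, I))
    print("Poynting inflow[W]:", poynting_inflow(sigma_cu, L, a, I))
```

The last two numbers coincide, which is the content of eq. (4.196).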




Problem 4.8. Inductance of a circuit 


Consider a set of closed loops $C_a$ ($a = 1, \ldots, N$), carrying currents $I_a$.
These currents generate a magnetic field, and we denote the flux of the total
magnetic field through the $a$-th loop by $\Phi_{B,a}$,

$$\Phi_{B,a} = \int_{S_a} d\mathbf{s}\cdot\mathbf{B}, \tag{4.198}$$

where $S_a$ is any surface having $C_a$ as the boundary. The linearity of Maxwell's
equations implies that the relation between the currents and the fluxes is linear,
i.e.,

$$\Phi_{B,a} = \sum_{b=1}^{N} L_{ab} I_b. \tag{4.199}$$

The coefficients $L_{ab}$, with $a \neq b$, are called the mutual inductances, while the
diagonal element $L_a \equiv L_{aa}$ is the self-inductance (or, simply, the inductance)
of the $a$-th loop. In Section 5.3, we will prove explicitly eq. (4.199), and we
will show how to compute the mutual inductances in terms of the shapes and
positions of the loops. From eq. (4.199), we see that the dimensions of $L_{ab}$, in
SI units, are the same as T m$^2$/A. Comparing eqs. (2.19) and (2.20) we see
that this is the same as V s/A. This derived SI unit for inductance is called
the henry (H).
If we have just a single loop, we can omit the index $a$ and write, more
simply,

$$\Phi_B = LI. \tag{4.200}$$
As an application, consider the situation in which we have a single loop; the
current in the loop is initially zero and the loop is connected to a battery,
which is suddenly switched on at time $t = 0$. We denote by $V_0$ the
electromotive force (emf) provided by the battery. This emf drives a current in the
loop, so $I(t)$ rises with time. According to eq. (4.200), the time evolution
of the current induces a time evolution of the flux $\Phi_B$. In turn, according to
Faraday's law (3.21), this induces an electromotive force $-d\Phi_B/dt$. Then the
total electromotive force in the loop is

$$\mathcal{E}_{\rm emf} = V_0 - L\frac{dI}{dt}. \tag{4.201}$$

The relative minus sign is the content of Lenz's law, as we have seen in
Section 4.3.1, and is such that the induced electromotive force opposes the change
in the flux produced by the external source. If the loop has a resistance $R$,
then Ohm's law (4.185) states that $\mathcal{E}_{\rm emf} = RI$. Therefore, the equation that
governs the time evolution of the current, after the battery is switched on, is

$$V_0 - L\frac{dI}{dt} = RI, \tag{4.202}$$

i.e.,

$$\tau\frac{dI}{dt} + I = \frac{V_0}{R}, \tag{4.203}$$

where $\tau = L/R$. The solution, with the initial condition $I(t = 0) = 0$, is

$$I(t) = \frac{V_0}{R}\left(1 - e^{-t/\tau}\right). \tag{4.204}$$

Therefore, $\tau$ is the timescale on which the current rises to its asymptotic
value $V_0/R$. For a given value of $R$, the larger $L$ is, the slower the rise of
the current.
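
The solution (4.204) can be checked against a direct numerical integration of eq. (4.202). The sketch below is only illustrative: the choice of a second-order midpoint scheme, the function names, and the values of $V_0$, $R$, and $L$ are assumptions made for the example.

```python
import math

def current_analytic(t, v0, resistance, inductance):
    """Eq. (4.204): I(t) = (V0/R) (1 - exp(-t/tau)), with tau = L/R."""
    tau = inductance / resistance
    return (v0 / resistance) * (1 - math.exp(-t / tau))

def current_numeric(t_end, v0, resistance, inductance, steps=10_000):
    """Integrate L dI/dt = V0 - R I from I(0) = 0 with the midpoint method."""
    dt = t_end / steps
    current = 0.0
    for _ in range(steps):
        slope_start = (v0 - resistance * current) / inductance
        midpoint = current + 0.5 * dt * slope_start
        current += dt * (v0 - resistance * midpoint) / inductance
    return current

if __name__ == "__main__":
    V0, R, L = 12.0, 4.0, 2.0   # V, ohm, H (illustrative values), so tau = 0.5 s
    for t in (0.25, 0.5, 1.0, 2.5):
        print(t, current_numeric(t, V0, R, L), current_analytic(t, V0, R, L))
```

After a time of a few $\tau$ the current is within a percent of its asymptotic value $V_0/R$, and increasing $L$ at fixed $R$ stretches the rise accordingly.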


Electromagnetic energy 


As we have seen in Section 3.2.2, the electromagnetic field carries energy. 
In this chapter, we will see that the energy stored in a static electric field 
is equal to the work that has been done to assemble the configuration 
of charges that generates it (although, for point charges, this requires
dealing with some "self-energy" divergences) and, similarly, the energy
stored in a static magnetic field is equal to the work needed to pro- 
duce the configuration of currents that generates it. We will also show 
how to rewrite the electromagnetic energy in different useful forms, in 
electrostatics and magnetostatics. 

Another important aspect is the relation between the electromag- 
netic energy and the “mechanical potentials” U, from which (for non- 
relativistic systems) the electromagnetic force acting on the sources can 
be derived, as F = —VU. We will see that in electrostatics, in par- 
ticular when dealing with a set of conductors, to compute the forces 
acting on them we must distinguish between a mechanical potential at 
fixed charges, and a mechanical potential computed keeping fixed the 
electrostatic potentials on their surfaces. A similar issue arises in mag- 
netostatics. 


5.1 Work and energy in electrostatics 


In this section, we evaluate the energy of a system of point-like charges
$q_a$ at given positions $\mathbf{x}_a$, using the mechanical definition of energy of a
system as the work that should be done, by an external agent, to build
the desired configuration. We will then consider the generalization to
continuous charge distributions. In Section 5.2, we will compare with
the energy stored in the corresponding electromagnetic field.

We consider first a system of two charges, $q_1$ and $q_2$, and we compute
the work done by an external agent to bring the second charge from
infinity to a final position $\mathbf{x}_2$, in the potential generated by the first
charge, which is located at a fixed position $\mathbf{x}_1$.$^{1}$ To compute the work
done, we assume that the position of the first charge is nailed down in
$\mathbf{x}_1$, so that it does not move. Then the work is given by eq. (4.33),
where, from eq. (4.6), the potential generated by the first charge is

$$\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1}{|\mathbf{x} - \mathbf{x}_1|}. \tag{5.1}$$

If the charge comes from an initial position $\mathbf{x}_0$ with $|\mathbf{x}_0| \to \infty$, we have
$\phi(\mathbf{x}_0) = 0$, so,

$$W^{(2)}_{\rm ext} = q_2\,\phi(\mathbf{x}_2) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1 q_2}{|\mathbf{x}_2 - \mathbf{x}_1|}, \tag{5.2}$$

where the superscript (2) reminds us that this is the work done to bring
the charge $q_2$ to the specified position.$^{2}$

$^{1}$ Observe that we are interested in building a final static configuration of
charges by bringing them into the desired position from infinity. We can
imagine doing it very slowly, so that the velocities of the particles during
the whole process are infinitesimally small, and we can then apply the
non-relativistic notions of force and work to obtain the exact energy required to
build the final field configuration.

$^{2}$ As a check of the sign, we observe that the work needed to bring closer to each
other two charges of the same sign must be positive, to overcome their
repulsion, and this is correctly reproduced by eq. (5.2). Also note that, as
discussed after eq. (4.31), the result is independent of the path used to bring the
charge $q_2$ from infinity to the position $\mathbf{x}_2$.

$^{3}$ It is often remarked that we are using here the superposition principle,
that states that the electric and magnetic fields generated by an ensemble
of charges with pre-assigned positions or trajectories are the (vector) sum of
the fields generated by the individual charges. However, as we discussed in
Section 4.1.1, in classical electrodynamics this is not a separate principle, but
just a consequence of the linearity of Maxwell's equations. At the quantum
level, there can be effects that generate non-linearities, which will manifest
at microscopic scales or for sufficiently strong fields.

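We now keep the positions $\mathbf{x}_1$ and $\mathbf{x}_2$ of these two charges fixed, and
we compute the work needed to bring a third charge, $q_3$, from infinity to
a given position $\mathbf{x}_3$. We therefore use again eq. (4.33), where now $\phi(\mathbf{x})$
is the potential generated by the first two charges,³

$$\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\left[\frac{q_1}{|\mathbf{x}-\mathbf{x}_1|} + \frac{q_2}{|\mathbf{x}-\mathbf{x}_2|}\right]. \tag{5.3}$$

The work needed to bring the charge $q_3$ to the specified position is
therefore

$$W^{(3)}_{\rm ext} = \frac{1}{4\pi\epsilon_0}\left[\frac{q_1 q_3}{|\mathbf{x}_3-\mathbf{x}_1|} + \frac{q_2 q_3}{|\mathbf{x}_3-\mathbf{x}_2|}\right]. \tag{5.4}$$

The total work $W_{\rm ext}$ needed to build the configuration made of these three
charges is $W^{(2)}_{\rm ext} + W^{(3)}_{\rm ext}$, i.e.,

$$W_{\rm ext} = \frac{1}{4\pi\epsilon_0}\left[\frac{q_1 q_2}{|\mathbf{x}_2-\mathbf{x}_1|} + \frac{q_1 q_3}{|\mathbf{x}_3-\mathbf{x}_1|} + \frac{q_2 q_3}{|\mathbf{x}_3-\mathbf{x}_2|}\right]. \tag{5.5}$$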

Note that the expression is fully symmetrical with respect to the charges, 
independently of the order in which they are brought from infinity to 
their final position; as we have seen in Section 4.1.3, it is also independent 
of the path chosen to bring the charges to their final positions, and is
therefore a function of the final configuration only. 
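
As a quick numerical illustration of this order-independence, the following minimal Python sketch assembles three charges one at a time, in all six possible orders, and compares the total work. The charge values and positions are hypothetical, chosen only for illustration; we use units in which $1/(4\pi\epsilon_0) = 1$, and the helper names `potential` and `assembly_work` are ours.

```python
import itertools
import math

# Units with 1/(4*pi*eps0) = 1, so the potential of a charge q at distance r is q/r.

def potential(x, placed):
    """Potential at x generated by the already-placed charges [(q, position), ...]."""
    return sum(q / math.dist(x, pos) for q, pos in placed)

def assembly_work(charges):
    """External work to bring the charges from infinity, one at a time, in the
    order given: each new charge feels the potential of those already in place."""
    placed, work = [], 0.0
    for q, pos in charges:
        work += q * potential(pos, placed)  # vanishes for the first charge
        placed.append((q, pos))
    return work

# Hypothetical three-charge configuration (illustrative values only).
config = [(1.0, (0.0, 0.0, 0.0)), (-2.0, (1.0, 0.0, 0.0)), (3.0, (0.0, 2.0, 0.0))]

# Total work for each of the 3! = 6 possible assembly orders.
works = [assembly_work(list(order)) for order in itertools.permutations(config)]
print(works)
```

All six entries of `works` coincide up to rounding, as expected from the symmetry of eq. (5.5) under permutations of the charges.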

It is clear that we can now proceed iteratively and, for a system of $N$
charges, we get

$$W_{\rm ext} = \frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b>a}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,. \tag{5.6}$$


Since the energy of a (non-relativistic) system is the same as the work
done by an external agent to build it, the electrostatic energy $\mathcal{E}_E$ of a
static system of point charges is

$$(\mathcal{E}_E)_{\rm p.p.} = \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,, \tag{5.7}$$

where the subscript “p.p.” stands for “point particles”, and we included
in the sum over $b$ both $b > a$ and $b < a$, so that each pair is counted
twice, and we compensated for this by dividing by two. From eq. (4.11) we
see that the electrostatic potential felt by the $a$-th charge because of the
interaction with all other charges is

$$\phi_a(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,, \tag{5.8}$$




so eq. (5.7) can also be written as 


$$(\mathcal{E}_E)_{\rm p.p.} = \frac{1}{2}\sum_{a=1}^{N} q_a\,\phi_a\,. \tag{5.9}$$


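The formal generalizations of eqs. (5.7) and (5.9) to a continuous charge
distribution $\rho(\mathbf{x})$ are

$$\mathcal{E}_E = \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho(\mathbf{x})\,\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.10}$$

and

$$\mathcal{E}_E = \frac{1}{2}\int d^3x\,\rho(\mathbf{x})\,\phi(\mathbf{x})\,, \tag{5.11}$$

respectively. We note, however, that in the continuous formulation the
condition $a \neq b$ that appears in eq. (5.7) is absent. We will discuss this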
point in great detail in Section 5.2.2. 
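
The equivalence of the point-particle expressions (5.6), (5.7) and (5.9) can be checked with a short Python sketch. It is only an illustration under stated assumptions: units with $1/(4\pi\epsilon_0) = 1$, a hypothetical set of charges, and helper names of our own choosing.

```python
import math

def pair_sum_energy(charges):
    """Eq. (5.6): sum over distinct pairs b > a, in units where 1/(4*pi*eps0) = 1."""
    total = 0.0
    for a in range(len(charges)):
        for b in range(a + 1, len(charges)):
            (qa, xa), (qb, xb) = charges[a], charges[b]
            total += qa * qb / math.dist(xa, xb)
    return total

def potential_on(a, charges):
    """Eq. (5.8): potential felt by charge a, generated by all the other charges."""
    _, xa = charges[a]
    return sum(qb / math.dist(xa, xb)
               for b, (qb, xb) in enumerate(charges) if b != a)

def half_q_phi_energy(charges):
    """Eq. (5.9): one half of the sum over a of q_a * phi_a."""
    return 0.5 * sum(q * potential_on(a, charges)
                     for a, (q, _) in enumerate(charges))

# Hypothetical configuration (illustrative values only).
charges = [(1.0, (0.0, 0.0, 0.0)), (-2.0, (1.0, 0.0, 0.0)), (3.0, (0.0, 2.0, 0.0))]
print(pair_sum_energy(charges), half_q_phi_energy(charges))
```

The two printed numbers agree: the factor 1/2 in eq. (5.9) compensates for the fact that each pair contributes once through $\phi_a$ and once through $\phi_b$.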

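Consider now the situation where we have two continuous charge
distributions $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x})$, localized in two non-overlapping volumes $V_1$
and $V_2$, respectively. Setting $\rho(\mathbf{x}) = \rho_1(\mathbf{x}) + \rho_2(\mathbf{x})$ in eq. (5.10), we get
$\mathcal{E}_E = \mathcal{E}_E^{(1)} + \mathcal{E}_E^{(2)} + \mathcal{E}_E^{\rm int}$, where

$$\mathcal{E}_E^{(a)} = \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_a(\mathbf{x})\,\rho_a(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.12}$$

(with $a = 1,2$), can be seen as the electromagnetic “self-energy” of the
$a$-th charge distribution, while

$$\mathcal{E}_E^{\rm int} = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_1(\mathbf{x})\,\rho_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.13}$$

is the interaction energy between the two charge distributions.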


5.2 Energy stored in a static electric field 


The energy stored in a static electric field can be obtained from the 
general results of Section 3.2.2. We will see how it can be rewritten in 
different useful forms. We will then take the limit of point-like charges, 
and compare with the work needed to build the corresponding charge 
configuration, that we computed in Section 5.1. This computation is 
instructive, also because it allows us to illustrate some subtleties in 
the passage from a continuous charge distribution to a set of point-like 
charges. 


5.2.1 Continuous charge distribution 


We start from the expression (3.41) for the energy density of the
electromagnetic field. In the context of electrostatics, $\mathbf{B} = 0$ and $\mathbf{E}$ depends
only on $\mathbf{x}$. Furthermore, $\mathbf{E}$ can be written in terms of $\phi$ as in eq. (4.30).
Then the electrostatic energy in a volume $V$, sufficiently large to include
all the charges under consideration, is

$$\mathcal{E}_E = \frac{\epsilon_0}{2}\int_V d^3x\,\mathbf{E}^2(\mathbf{x}) = \frac{\epsilon_0}{2}\int_V d^3x\,\left[\nabla\cdot(\phi\nabla\phi) - \phi\nabla^2\phi\right]. \tag{5.14}$$


As we already saw in the discussion of eqs. (4.41) and (4.42), for a
localized charge distribution, the term $\nabla\cdot(\phi\nabla\phi)$ gives a boundary term
on the surface $\partial V$, that vanishes when we send the integration volume to
infinity. Furthermore, in electrostatics $\phi$ obeys Poisson’s equation (4.3),
$\nabla^2\phi = -\rho/\epsilon_0$. Then, we get

$$\mathcal{E}_E = \frac{1}{2}\int d^3x\,\rho(\mathbf{x})\,\phi(\mathbf{x})\,, \tag{5.15}$$

where we have sent $V \to \infty$, so the integration is now over all of space,
in order to eliminate the boundary term. The solution of Poisson’s
equation (4.3) for a generic charge distribution is given by eq. (4.16).
Inserting it into eq. (5.15) we get

$$\mathcal{E}_E = \frac{1}{8\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho(\mathbf{x})\,\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.16}$$


which allows us to express the energy stored in the electric field, in the 
electrostatic limit, in terms of the charge distribution that generates it. 

These results agree with eqs. (5.10) and (5.11), that were obtained 
computing the work performed to assemble a configuration of point 
charges and performing a “naive” generalization to a continuous
distribution. As we mentioned, a subtle point is that, in the point-particle
formulation, $\phi_a$ is the electrostatic potential generated by all charges
except the $a$-th charge, see eq. (5.8). In the continuous formulation, a
condition equivalent to $b \neq a$ in eq. (5.8) is absent. We will examine this
point in Section 5.2.2. Barring a clarification of this point, we have then 
found that the energy of a static charge configuration, defined as the 
work done by an external agent to assemble it, is the same as the energy 
stored in the electric field that this charge configuration generates. 
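
As an illustration of the agreement between eqs. (5.14) and (5.15), the sketch below evaluates both integrals numerically for a uniformly charged ball, an example of our own choosing that is not worked out in the text. We assume units with $1/(4\pi\epsilon_0) = 1$ and $Q = R = 1$, and use the standard field and potential of a uniform ball; both integrals should give $3Q^2/(5R) = 0.6$ in these units.

```python
import math

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Units: 1/(4*pi*eps0) = 1, i.e. eps0 = 1/(4*pi); ball of radius R, total charge Q.
Q, R = 1.0, 1.0
EPS0 = 1.0 / (4.0 * math.pi)
RHO = 3.0 * Q / (4.0 * math.pi * R**3)  # uniform charge density inside the ball

def e_field(r):
    """Radial field of the uniformly charged ball (Gauss's law)."""
    return Q * r / R**3 if r < R else Q / r**2

def phi_inside(r):
    """Potential for r <= R, normalized to vanish at infinity."""
    return Q * (3.0 - r**2 / R**2) / (2.0 * R)

# (eps0/2) * integral of |E|^2 over all space. Outside the ball we change
# variable to s = 1/r, which maps [R, infinity) to (0, 1/R] and turns the
# radial integrand Q^2 / r^2 dr into the constant Q^2 ds.
field_inside = midpoint(
    lambda r: 0.5 * EPS0 * e_field(r)**2 * 4.0 * math.pi * r**2, 0.0, R)
field_outside = midpoint(
    lambda s: 0.5 * EPS0 * Q**2 * 4.0 * math.pi, 0.0, 1.0 / R)
energy_from_field = field_inside + field_outside

# (1/2) * integral of rho * phi, which only receives contributions from r <= R.
energy_from_rho_phi = midpoint(
    lambda r: 0.5 * RHO * phi_inside(r) * 4.0 * math.pi * r**2, 0.0, R)

print(energy_from_field, energy_from_rho_phi, field_outside)
```

The two totals coincide, although most of the field energy (`field_outside`) sits in the charge-free region $r > R$, where $\rho\phi$ vanishes identically.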

A first comment on these results is that the previous manipulations 
raise a question on the uniqueness of the expression for the energy den- 
sity. From eq. (3.41), we would naturally conclude that the energy den- 
sity of the electromagnetic field, in the electrostatic limit, is 


$$u(\mathbf{x}) = \frac{\epsilon_0}{2}\,|\mathbf{E}(\mathbf{x})|^2\,, \tag{5.17}$$

as we wrote indeed in eq. (3.43). In contrast, eq. (5.15) might suggest
that we identify the energy density with

$$u(\mathbf{x}) \stackrel{?}{=} \frac{1}{2}\,\rho(\mathbf{x})\,\phi(\mathbf{x})\,. \tag{5.18}$$

Locally, the two expressions are very different. In particular, $u(\mathbf{x})$ in
eq. (5.18) is localized on the position of the charges, and vanishes in
regions where $\rho = 0$ (so, in particular, for a set of point-like charges
it would be a sum of Dirac delta functions), while the expression in
eq. (5.17) is non-vanishing even in charge-free regions. Also note that
$|\mathbf{E}(\mathbf{x})|^2$ is never negative, while $\rho(\mathbf{x})\phi(\mathbf{x})$ can have either sign. The correct
interpretation is that the energy density of the electromagnetic field is given by
eq. (3.43) and so, for electrostatics, by eq. (5.17). Indeed, the derivation
leading to eq. (3.43) was completely general, independently of the
specific form of the charge and current distributions in Maxwell’s
equations. In contrast, eq. (5.15) only holds in electrostatics. Basically, what
we have done has been to use the equations of electrostatics to rewrite
the integral over $|\mathbf{E}(\mathbf{x})|^2$ as an integral over a different integrand that
gives the same result. This, however, was specific to electrostatics, and
even to the situation when the integral is over all of space rather than
over a finite volume (otherwise we cannot discard the boundary term),
while eq. (3.43) is much more general. For instance, we will see in
Chapter 9 that we can associate an energy density even to electromagnetic
waves propagating in vacuum, in which case the source terms simply
vanish. Therefore, no special general meaning should be attached
to $\rho(\mathbf{x})\phi(\mathbf{x})/2$, apart from being a function whose spatial integral over
$\mathbb{R}^3$, in electrostatics, happens to coincide with the spatial integral of
$\epsilon_0|\mathbf{E}(\mathbf{x})|^2/2$. The general expression for the energy density of the
electromagnetic field is given by eq. (3.43).⁴

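A second comment is that the integrand in eq. (5.16) becomes singular
when $|\mathbf{x}-\mathbf{x}'| \to 0$. We must therefore understand under what conditions
the integral converges, since only in that case does eq. (5.16) provide a
well-defined expression for the energy of a static charge distribution. The
issue is clearer in Fourier space. According to the definition (1.100), we
write

$$\rho(\mathbf{x}) = \int\frac{d^3k}{(2\pi)^3}\,\tilde\rho(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}\,, \tag{5.19}$$

and we use eq. (1.115) for the Fourier transform of $1/|\mathbf{x}|$, which allows
us to write

$$\frac{1}{|\mathbf{x}-\mathbf{x}'|} = \int\frac{d^3k}{(2\pi)^3}\,\frac{4\pi}{k^2}\,e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\,, \tag{5.20}$$

where $k = |\mathbf{k}|$. Then eq. (5.16) can be rewritten as⁵

$$\mathcal{E}_E = \frac{1}{2\epsilon_0}\int\frac{d^3k}{(2\pi)^3}\,\frac{\tilde\rho(-\mathbf{k})\,\tilde\rho(\mathbf{k})}{k^2}\,. \tag{5.21}$$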


The possible divergence of the integral in eq. (5.16) as $|\mathbf{x}-\mathbf{x}'| \to 0$
translates into a possible divergence of eq. (5.21) at large $k$. This is to
be expected from the general properties of the Fourier transform since,
in a relation such as (1.100), the term $\tilde f(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{x}}$ describes features of the
function $f(\mathbf{x})$ at scales $|\mathbf{x}| \sim 1/|\mathbf{k}|$, so the Fourier modes with large $k$
describe the short-distance behavior. Writing $d^3k = k^2\,dk\,d\Omega$, eq. (5.21)
becomes

$$\mathcal{E}_E = \frac{1}{16\pi^3\epsilon_0}\int_0^\infty dk\int d\Omega\;\tilde\rho(-\mathbf{k})\,\tilde\rho(\mathbf{k})\,. \tag{5.22}$$

⁴ As we have discussed in Note 11 on page 47, there is some potential
ambiguity even for this expression of the energy density, corresponding to
the possibility of adding to it a term $\nabla\cdot\mathbf{v}$, as in eq. (3.45). As we
mentioned in Note 11, this, however, can be set to zero appealing to the
Lorentz covariance of the energy-momentum tensor in Special Relativity, or
even better, using General Relativity to identify the energy density that
couples to the gravitational field, which is indeed given by eq. (3.43).

⁵ The explicit computation goes as follows:

$$\begin{aligned}
\int d^3x\,d^3x'\,\frac{\rho(\mathbf{x})\,\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}
&= \int d^3x\,d^3x'\int\frac{d^3k_1}{(2\pi)^3}\frac{d^3k_2}{(2\pi)^3}\frac{d^3k_3}{(2\pi)^3}\;
\tilde\rho(\mathbf{k}_1)\,e^{i\mathbf{k}_1\cdot\mathbf{x}}\,\frac{4\pi}{k_2^2}\,e^{i\mathbf{k}_2\cdot(\mathbf{x}-\mathbf{x}')}\,\tilde\rho(\mathbf{k}_3)\,e^{i\mathbf{k}_3\cdot\mathbf{x}'}\\
&= 4\pi\int\frac{d^3k_1}{(2\pi)^3}\frac{d^3k_2}{(2\pi)^3}\frac{d^3k_3}{(2\pi)^3}\;
\frac{\tilde\rho(\mathbf{k}_1)\,\tilde\rho(\mathbf{k}_3)}{k_2^2}
\int d^3x\,e^{i(\mathbf{k}_1+\mathbf{k}_2)\cdot\mathbf{x}}\int d^3x'\,e^{i(\mathbf{k}_3-\mathbf{k}_2)\cdot\mathbf{x}'}\\
&= 4\pi\int\frac{d^3k_1}{(2\pi)^3}\frac{d^3k_2}{(2\pi)^3}\frac{d^3k_3}{(2\pi)^3}\;
\frac{\tilde\rho(\mathbf{k}_1)\,\tilde\rho(\mathbf{k}_3)}{k_2^2}\,
(2\pi)^3\delta^{(3)}(\mathbf{k}_1+\mathbf{k}_2)\,(2\pi)^3\delta^{(3)}(\mathbf{k}_3-\mathbf{k}_2)\\
&= 4\pi\int\frac{d^3k_1}{(2\pi)^3}\;\frac{\tilde\rho(-\mathbf{k}_1)\,\tilde\rho(\mathbf{k}_1)}{k_1^2}\,.
\end{aligned}$$

Renaming $\mathbf{k}_1 = \mathbf{k}$ we get eq. (5.21).

⁶ This condition is sufficient, but it is not necessary. A term in
$\tilde\rho(-\mathbf{k})\tilde\rho(\mathbf{k})$ that goes to zero more slowly could still give a vanishing
contribution when integrated over the solid angle.

⁷ This part can be skipped at first reading. The bottom line of the
following discussion is that these divergent self-energy terms, once
regularized (i.e., suitably treated from the mathematical point of view),
are just constants, and can be discarded.

The reader with experience in quantum field theory, on the other hand, will
notice that, even if we are in a purely classical context, we have patterned
our discussion in analogy to the procedure of regularization and
renormalization in quantum field theory. In Section 12.3.1, we will expand
on this approach, and we will compare it to attempts to deal with these
problems by building a classical model of an extended electron.
The condition for convergence is therefore that the Fourier modes $\tilde\rho(\mathbf{k})$
go to zero as $k \to \infty$ sufficiently fast, so that this integral converges. A
sufficient condition for this is that $\tilde\rho(\mathbf{k})$ goes to zero faster than $1/k^{1/2}$,
so that $\tilde\rho(-\mathbf{k})\tilde\rho(\mathbf{k})$ goes to zero faster than $1/k$.⁶ Therefore if, in this sense,
$\rho(\mathbf{x})$ is sufficiently smooth on short scales, the integral in eq. (5.21)
[and, therefore, that in eq. (5.16)] converges and provides a well-defined
expression for the energy generated by a static charge distribution. 
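
The convergence criterion can be illustrated numerically. The sketch below evaluates eq. (5.22), truncated at a maximum wavenumber, for two isotropic choices of $\tilde\rho(k)$ that are our own illustrative assumptions: a Gaussian, for which the integral converges to $Q^2/(2\sqrt{\pi}\,\sigma)$ in units with $1/(4\pi\epsilon_0) = 1$, and a borderline case $\tilde\rho \propto (1+k)^{-1/2}$, for which it grows logarithmically.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def energy_up_to(rho_tilde, k_max):
    """Eq. (5.22) for a real, isotropic rho_tilde(k), in units 1/(4*pi*eps0) = 1,
    with the k integral truncated at k_max. The angular integral gives 4*pi,
    so the prefactor 4*pi / (16*pi^3*eps0) reduces to 1/pi."""
    return midpoint(lambda k: rho_tilde(k)**2, 0.0, k_max) / math.pi

Q, SIGMA = 1.0, 1.0

def gaussian_modes(k):
    """Fourier modes of a Gaussian charge density of total charge Q, width SIGMA."""
    return Q * math.exp(-0.5 * (k * SIGMA)**2)

def borderline_modes(k):
    """Hypothetical modes decaying exactly like 1/k^(1/2) at large k."""
    return Q / math.sqrt(1.0 + k)

for k_max in (5.0, 10.0, 20.0):
    print(k_max, energy_up_to(gaussian_modes, k_max),
          energy_up_to(borderline_modes, k_max))
```

The Gaussian column saturates quickly, while the borderline column keeps growing like $\ln(1+k_{\max})/\pi$, showing that a decay of exactly $1/k^{1/2}$ is not fast enough for an isotropic distribution.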


5.2.2 The point-like limit and particle self-energies 


The most notable exception to the smooth behavior defined above is
obtained for an ensemble of point-like charges, described by Dirac deltas.
Consider a set of charged particles with charges $q_a$ and fixed positions
$\mathbf{x}_a$, with $a = 1,\ldots,N$. In this case,

$$\rho(\mathbf{x}) = \sum_{a=1}^{N} q_a\,\delta^{(3)}(\mathbf{x}-\mathbf{x}_a)\,, \tag{5.23}$$

and, from eq. (1.99),

$$\tilde\rho(\mathbf{k}) = \sum_{a=1}^{N} q_a\,e^{-i\mathbf{k}\cdot\mathbf{x}_a}\,. \tag{5.24}$$

We see that, for point-like charges, the Fourier modes $\tilde\rho(\mathbf{k})$ do not even
go to zero as $k \to \infty$, and the integrals in eqs. (5.16) and (5.21) diverge.
Indeed, if we naively insert eq. (5.23) into eq. (5.16), we get

$$\mathcal{E}_E \stackrel{?}{=} \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b=1}^{N}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,. \tag{5.25}$$

The question mark stresses that what we are doing is not correct, since
$\tilde\rho(\mathbf{k})$ in this case does not satisfy the convergence condition; as a result,
the right-hand side is divergent, because of the contribution of the terms
with $a = b$.

To understand the meaning of this apparently nonsensical result,
consider first the terms in eq. (5.25) with $b \neq a$. These are just the terms
that we denoted as $(\mathcal{E}_E)_{\rm p.p.}$ in eq. (5.7). As we have seen in Section 5.1,
this is equal to the work that one must perform to assemble this
distribution of charges, starting from a set of charges at infinite distances
from each other, and therefore agrees with the definition of energy of
the system as the work needed to build the given configuration.

So, the terms with $a \neq b$ in eq. (5.25) give precisely the result expected
for a system of point-like particles with Coulomb interaction, and the
trouble comes from the terms where $a = b$. This is a sort of “self-energy”
term, and it is interesting to understand the origin, and the cure, of this
divergence in some detail.⁷ The problem is in the assumption of exactly
point-like particles, and its resolution comes from the realization that
the notion of an exactly point-like particle is a mathematical
idealization that is useful in many situations, but this is one case in which the
idealization leads us astray. When we write the charge density as in
eq. (5.23), we are implicitly assuming that we know the structure of
an elementary particle down to infinitely small length scales or,
equivalently, eq. (5.24) assumes that we know the Fourier modes of its charge
distribution up to infinitely large values of $k$. However, the structure of
elementary particles can only be discussed in the framework of quantum
mechanics or, in fact, quantum field theory. Within the context of a
classical treatment, the best that one can do is to acknowledge one’s
ignorance of physics at sufficiently small scales, where quantum effects
enter into play. If we denote by $\ell$ the length-scale below which a classical
treatment of elementary particles breaks down, this amounts to saying
that we do not know the Fourier modes of the charge distribution for
modes with $|\mathbf{k}| \gtrsim O(1/\ell)$. We should then put a cutoff on the integrals
over $d^3k$, integrating only over the Fourier modes with $|\mathbf{k}|$ smaller than
some cutoff value $\Lambda$ of order $1/\ell$, and deferring the proper treatment
of higher values of $|\mathbf{k}|$ to quantum mechanics and quantum field
theory. Then eq. (5.23), which, using the integral representation (1.79) of
the Dirac delta, can be written as

$$\rho(\mathbf{x}) = \sum_{a=1}^{N} q_a\int\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}_a)}\,, \tag{5.26}$$

should be replaced by

$$\rho(\mathbf{x}) = \sum_{a=1}^{N} q_a\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}_a)}\,. \tag{5.27}$$

The resulting expression is a smoothed approximation to the Dirac delta,
of the type of the functions used in Section 1.4 to approximate the Dirac
delta (which is actually a distribution) with a sequence of “normal”
functions, and with a smoothing length-scale $\ell \sim 1/\Lambda$. Equation (5.27)
is formally equivalent to setting to zero all Fourier modes $\tilde\rho(\mathbf{k})$ with
$|\mathbf{k}| > \Lambda$ and therefore has the effect of restricting also the integral in
eq. (5.21) to $|\mathbf{k}| < \Lambda$. Then eq. (5.21) becomes

$$(\mathcal{E}_E)_{\rm reg} = \frac{1}{2\epsilon_0}\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,\frac{\tilde\rho(-\mathbf{k})\,\tilde\rho(\mathbf{k})}{k^2}\,, \tag{5.28}$$


and is now finite. The subscript “reg” stands for “regularized” and
means that now this expression is, at least, mathematically well defined.
We can then insert eq. (5.24), which is still valid for the modes with
$|\mathbf{k}| < \Lambda$, into eq. (5.28). This gives

$$\begin{aligned}
(\mathcal{E}_E)_{\rm reg} &= \frac{1}{2\epsilon_0}\sum_{a,b=1}^{N} q_a q_b\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot(\mathbf{x}_a-\mathbf{x}_b)}}{k^2}\\
&= \frac{1}{\epsilon_0}\sum_{a=1}^{N}\sum_{b>a} q_a q_b\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot(\mathbf{x}_a-\mathbf{x}_b)}}{k^2}
+ \frac{1}{2\epsilon_0}\sum_{a=1}^{N} q_a^2\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,\frac{1}{k^2}\,.
\end{aligned}\tag{5.29}$$

The terms with $a \neq b$ have a finite limit for $\Lambda \to \infty$. This means that
they are insensitive to the details of the structure of elementary particles
at short distances, and for these terms we can take the limit $\Lambda \to \infty$. In
this limit, using eq. (1.115), we get back eq. (5.7). The integral in the
second term is easily computed analytically: passing to polar coordinates,

$$\int_{|\mathbf{k}|<\Lambda}\frac{d^3k}{(2\pi)^3}\,\frac{1}{k^2} = \frac{1}{(2\pi)^3}\int_0^{\Lambda} k^2\,dk\int d\Omega\,\frac{1}{k^2} = \frac{\Lambda}{2\pi^2}\,, \tag{5.30}$$

and diverges if we remove the cutoff, i.e., if we send $\Lambda \to \infty$. The result
becomes nicer defining $\ell = \pi/\Lambda$ (note that $\ell$ has dimensions of length).
Then, apart from terms that vanish as $\ell \to 0$,

$$(\mathcal{E}_E)_{\rm reg} = \frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b>a}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|} + \frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\frac{q_a^2}{\ell}\,. \tag{5.31}$$

The last term can be interpreted as a “Coulomb self-energy” term,
associated with the individual charges, rather than with their interaction. This
interpretation should not be taken too literally, first of all because, again,
classical electrodynamics is not the proper framework for studying the
nature of elementary particles. Observe, furthermore, that the exact
numerical value of the self-energy term depends on the details of the
regularization procedure. For instance, instead of putting a sharp cutoff
setting to zero all modes with $|\mathbf{k}| > \Lambda$, we could have used a smooth
cutoff that suppresses them gradually, and the precise numbers would
have been different, so this term reflects the details of our mathematical
regularization procedure, rather than a direct physical property. The
important points, however, are: (1) this term is divergent if we remove the
cutoff, sending $\ell \to 0$. For this reason, it was necessary to “regularize”
eq. (5.21) before applying it to point-like particles. (2) This “self-energy”
term, for given $\ell$, is just a numerical constant. The energy of a static
system of charges can be redefined as the difference between its value in
a given configuration and the value when all charges are at an infinite
distance from each other, since this represents the work that should be done
to assemble the charges in the given spatial configuration; the charges
themselves are taken as given, and we must not perform work to create
them. The constant self-energy term then cancels in this difference and
can therefore simply be discarded. We then get back the expression (5.7)
of the Coulomb energy of a system of interacting charges.⁸

In summary, eqs. (5.16) or (5.21), which have been derived from the
general formula (3.41) for the energy of the electromagnetic field, give
the energy of a localized and static charge distribution $\rho(\mathbf{x})$,
as long as the latter is sufficiently smooth on short scales. Basically,
the exception to this behavior is given by point-like particles. In this
case, one must be aware of the fact that the notion of point-like particle
is just a mathematical idealization and, after a suitable regularization
procedure and the subtraction of a constant term (that would diverge
when removing the cutoff), one obtains the expected result (5.7) for the
Coulomb potential energy of a set of charges.

⁸ See Section 12.3.1 for a more accurate formalization of this subtraction,
within the logic of renormalization. It would also be tempting to interpret
the self-energy term as the origin of the rest mass of a charged particle.
Then, for an electron, one could be tempted to fix $\ell$ from
$[1/(4\pi\epsilon_0)](e^2/\ell) = m_e c^2$, which would give

$$\ell = r_0 \equiv \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{m_e c^2}\,. \tag{5.32}$$

We will meet $r_0$ again in Section 16.2, where we will see that it enters the
scattering of electromagnetic waves off free electrons and is called the
“classical electron radius.” The idea of an electromagnetic origin of the
mass of elementary particles is suggestive but, once again, classical field
theory is not the right framework for addressing this type of question. In
particular, quantum mechanics introduces another length-scale associated
with the electron, called the Compton radius,

$$r_C = \frac{\hbar}{m_e c} \tag{5.33}$$

(where $\hbar$ is the reduced Planck constant), that we will also meet in
Section 16.2. The relation between $r_C$ and $r_0$ becomes clearer introducing
the fine structure constant

$$\alpha = \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{\hbar c}\,. \tag{5.34}$$

Dimensionally, this quantity is a pure number, and has the value
$\alpha \simeq 1/137$, so $\alpha \ll 1$. In terms of $\alpha$, the relation between $r_0$ and $r_C$ reads
$r_0 = \alpha\, r_C$, so quantum mechanics enters into play at a scale
$r_C = r_0/\alpha \gg r_0$. Therefore, one should already stop the classical treatment
at a scale $\ell \sim r_C$. Then, the self-energy term in eq. (5.31) is of order
$\alpha\, m_e c^2$, so it should rather be seen as a small electromagnetic correction
to the rest mass of the electron. Once again, within our classical discussion
we cannot push this reasoning too far, but the latter interpretation is
closer to the actual treatment in quantum electrodynamics, where one starts
with a “bare” mass term for the electron, and radiative effects, computed as
an expansion in powers of the small parameter $\alpha$, give corrections to it
producing a “renormalized” mass, which is then identified with the observed
mass; see e.g., Maggiore (2005) for an introductory textbook.
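
The structure of the regularized energy can be made concrete for two point charges at distance $r$. Performing the angular integral in eq. (5.29), the cross term becomes $(2/\pi)\,q_1 q_2\,{\rm Si}(\Lambda r)/r$ in units with $1/(4\pi\epsilon_0) = 1$, where ${\rm Si}$ is the sine integral, while the self term is $(q_1^2+q_2^2)\,\Lambda/\pi$. The following sketch evaluates both; the charges, the separation and the cutoff values are hypothetical, and ${\rm Si}$ is computed with a simple midpoint rule rather than a library routine.

```python
import math

def sine_integral(x, n=100000):
    """Si(x) = integral from 0 to x of sin(t)/t dt, by the composite midpoint rule."""
    h = x / n
    return h * sum(math.sin(t) / t for t in ((i + 0.5) * h for i in range(n)))

def regularized_energy(q1, q2, r, cutoff):
    """Eq. (5.29) for two point charges at distance r, units 1/(4*pi*eps0) = 1.
    Returns (interaction term, self-energy term) for a sharp cutoff |k| < cutoff."""
    interaction = (2.0 / math.pi) * q1 * q2 * sine_integral(cutoff * r) / r
    self_energy = (q1**2 + q2**2) * cutoff / math.pi
    return interaction, self_energy

# Hypothetical charges and separation (illustrative values only).
Q1, Q2, DISTANCE = 1.0, -2.0, 0.5

for cutoff in (50.0, 200.0, 800.0):
    interaction, self_energy = regularized_energy(Q1, Q2, DISTANCE, cutoff)
    print(cutoff, interaction, self_energy)
```

As the cutoff grows, the interaction term settles on the Coulomb value $q_1 q_2/r = -4$, while the self term grows linearly with $\Lambda$ but does not depend on $r$, so it drops out of any energy difference between two configurations.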
5.2.3 Energy of charges in an external electric field


Another important quantity is the electrostatic energy of a system of
charges in an external electric field. Ultimately, the external field will
itself be generated by a set of charges. So, we can model the situation
considering two sets of spatially separated charges, localized in two
non-overlapping volumes $V_1$ and $V_2$. Then, using a discrete formulation
first, the sum over $a$ in the equations of Section 5.1 splits into a sum
over the charges in $V_1$ and those in $V_2$, and eq. (5.7) becomes

$$(\mathcal{E}_E)_{\rm p.p.} = \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_{\substack{a,b\in V_1\\ b\neq a}}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}
+ \frac{1}{4\pi\epsilon_0}\sum_{a\in V_1}\sum_{b\in V_2}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}
+ \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_{\substack{a,b\in V_2\\ b\neq a}}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,. \tag{5.35}$$

The first term gives the interaction energy among the charges in $V_1$,
while the third term gives the interaction energy among the charges in
$V_2$. The middle term is the interaction between the two groups.⁹ Then,
the electrostatic interaction energy between the two groups of particles
is given by

$$\mathcal{E}_E^{\rm int} = \frac{1}{4\pi\epsilon_0}\sum_{a\in V_1}\sum_{b\in V_2}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,. \tag{5.36}$$

We next observe, from eq. (4.11), that

$$\phi_{\rm ext}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\sum_{b\in V_2}\frac{q_b}{|\mathbf{x}-\mathbf{x}_b|} \tag{5.37}$$

is the potential generated by the charges in $V_2$ at the point $\mathbf{x}$. The label
“ext” stresses that we consider this as an “external potential,” from the
point of view of the charges in $V_1$.¹⁰ Correspondingly, we denote the
expression in (5.36) as $(\mathcal{E}_E)_{\rm ext}$, to stress that this is the electrostatic
energy of a set of particles (the particles in $V_1$) due to the interaction
with a given external field. Then, combining with eq. (5.37), we get

$$(\mathcal{E}_E)_{\rm ext} = \sum_{a\in V_1} q_a\,\phi_{\rm ext}(\mathbf{x}_a)\,. \tag{5.38}$$

[Note the absence of the factor 1/2, compared to eq. (5.9).] The
continuous versions of these results are obtained from eq. (5.13), considering
the charge density $\rho_2(\mathbf{x}) \equiv \rho_{\rm ext}(\mathbf{x})$ as an external source,¹¹ and therefore
are given by

$$(\mathcal{E}_E)_{\rm ext} = \int d^3x\,\rho(\mathbf{x})\,\phi_{\rm ext}(\mathbf{x})\,, \tag{5.39}$$

where we denoted $\rho_1(\mathbf{x})$ simply as $\rho(\mathbf{x})$, and

$$\phi_{\rm ext}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho_{\rm ext}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.40}$$


is the scalar potential generated by the second charge distribution, which 
can be seen as an external potential from the point of view of the first 
charge distribution. 
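
The splitting of the point-particle energy into two internal terms plus an energy in the external potential, with no factor 1/2 in the latter, can be verified with a short sketch. The two groups of charges below are hypothetical, units are such that $1/(4\pi\epsilon_0) = 1$, and the function names are ours.

```python
import math

def coulomb_energy(charges):
    """Eq. (5.7): (1/2) * sum over a and b != a of q_a q_b / |x_a - x_b|."""
    total = 0.0
    for a, (qa, xa) in enumerate(charges):
        for b, (qb, xb) in enumerate(charges):
            if b != a:
                total += 0.5 * qa * qb / math.dist(xa, xb)
    return total

def phi_ext(x, sources):
    """Eq. (5.37): potential at x generated by the charges of the second group."""
    return sum(q / math.dist(x, pos) for q, pos in sources)

def energy_in_external_field(group1, group2):
    """Eq. (5.38): sum over the first group of q_a * phi_ext(x_a), no factor 1/2."""
    return sum(q * phi_ext(pos, group2) for q, pos in group1)

# Hypothetical groups of charges, localized in two well-separated regions.
group1 = [(1.0, (0.0, 0.0, 0.0)), (-1.0, (0.1, 0.0, 0.0))]
group2 = [(5.0, (3.0, 0.0, 0.0)), (-5.0, (3.0, 1.0, 0.0))]

total = coulomb_energy(group1 + group2)
parts = (coulomb_energy(group1) + coulomb_energy(group2)
         + energy_in_external_field(group1, group2))
print(total, parts)
```

The two numbers agree, reproducing the three-term decomposition; swapping the roles of the two groups in `energy_in_external_field` gives the same interaction energy.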


⁹ Note that, there, we could omit the condition $b \neq a$ since in this sum
$a$ and $b$ run over different groups of particles, so this condition is
automatically satisfied.

¹⁰ This distinction is useful in particular when the back-reaction of the
charges in $V_1$ on those in $V_2$ is small (for instance, the charges in $V_1$
are a small group of elementary charges, while those in $V_2$ form a
macroscopic object, e.g., a capacitor, creating a macroscopic electric
field), so we can consider the external field as given, independently of the
motion of the particles in $V_1$.

¹¹ We will then label again the corresponding energy as $(\mathcal{E}_E)_{\rm ext}$, to
stress that it is the energy of $\rho_1(\mathbf{x})$ in an external field, while in
eq. (5.13) it was labeled $\mathcal{E}_E^{\rm int}$, to stress that it is the interaction
energy between $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x})$.
5.2.4 Energy of a system of conductors


Consider a set of extended charge distributions $\rho_a(\mathbf{x})$, labeled by an
index $a = 1,\ldots,N$, each localized in a volume $V_a$, non-overlapping
with the other volumes. Then $\rho(\mathbf{x}) = \sum_a \rho_a(\mathbf{x})$, and the integral in
eq. (5.15) splits into a sum over non-overlapping volumes,

$$\mathcal{E}_E = \frac{1}{2}\sum_{a=1}^{N}\int_{V_a} d^3x\,\rho_a(\mathbf{x})\,\phi(\mathbf{x})\,. \tag{5.41}$$

In the special case in which these $N$ bodies are conductors, this
expression simplifies considerably. Recall from Section 4.1.6 that, for
conductors at equilibrium, the charge density is zero inside the conductor,
and non-vanishing only on its surface. Furthermore, we found that, on
the surface, the electrostatic potential is constant. Denoting by $\phi_a$ the
constant value of $\phi(\mathbf{x})$ on the surface of the $a$-th conductor, eq. (5.41)
becomes

$$\mathcal{E}_E = \frac{1}{2}\sum_{a=1}^{N}\phi_a\int_{V_a} d^3x\,\rho_a(\mathbf{x})\,, \tag{5.42}$$

where we could extract $\phi(\mathbf{x})$ from the integrals, since $\rho_a(\mathbf{x})$ is
proportional to a two-dimensional Dirac delta on the surface of the $a$-th
conductor, and there $\phi(\mathbf{x}) = \phi_a$. The remaining integral is the total charge
$Q_a$ of the $a$-th conductor, so we see that the energy of an ensemble of
conductors can be expressed as

$$\mathcal{E}_E = \frac{1}{2}\sum_{a=1}^{N} Q_a\,\phi_a\,, \tag{5.43}$$

which is analogous to eq. (5.9) for point particles. As we discussed in
Problem 4.3, for a set of conductors the charges $Q_a$ and the potentials $\phi_a$
are linearly related, through eq. (4.163) or the inverse relation (4.164).
Therefore, eq. (5.43) can be rewritten in terms of the potentials $\phi_a$ and
the coefficients of capacitance $C_{ab}$ as

$$\mathcal{E}_E = \frac{1}{2}\sum_{a,b=1}^{N} C_{ab}\,\phi_a\,\phi_b\,, \tag{5.44}$$

or in terms of the charges $Q_a$ and the coefficients of potential $P_{ab}$ as

$$\mathcal{E}_E = \frac{1}{2}\sum_{a,b=1}^{N} P_{ab}\,Q_a\,Q_b\,. \tag{5.45}$$

In Problem 5.1, we will check this expression for a single capacitor,
computing the work needed to assemble its charge configuration.

The relation between the energy of a system of conductors, and the 
mechanical potentials from which we can derive the forces acting on 
them is somewhat subtle and depends on whether the conductors are 
kept at fixed charge or at fixed electrostatic potential. We will examine 
this in Section 5.5.1. 
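
The equivalence of the three forms (5.43), (5.44) and (5.45) of the energy of a system of conductors is a matter of linear algebra, and can be checked on a two-conductor example. In the sketch below we assume the linear relation in the form $Q_a = \sum_b C_{ab}\phi_b$, with $P$ the inverse matrix of $C$, which is the form required for eqs. (5.44) and (5.45) to follow from eq. (5.43); the numerical values of $C_{ab}$ and $\phi_a$ are hypothetical, in arbitrary units.

```python
def charges_from_potentials(C, phi):
    """Assumed linear relation Q_a = sum_b C_ab * phi_b."""
    return [sum(C[a][b] * phi[b] for b in range(len(phi))) for a in range(len(phi))]

def invert_2x2(M):
    """Inverse of a 2x2 matrix, used to get the coefficients of potential from C."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def quadratic_form(M, v):
    """(1/2) * sum_{a,b} M_ab v_a v_b, the structure of eqs. (5.44) and (5.45)."""
    n = len(v)
    return 0.5 * sum(M[a][b] * v[a] * v[b] for a in range(n) for b in range(n))

# Hypothetical symmetric capacitance matrix and potentials (arbitrary units).
C = [[3.0, -1.0], [-1.0, 2.0]]
phi = [2.0, -1.0]

Q = charges_from_potentials(C, phi)                       # [7.0, -4.0]
energy_q_phi = 0.5 * sum(q * p for q, p in zip(Q, phi))   # eq. (5.43)
energy_c = quadratic_form(C, phi)                         # eq. (5.44)
energy_p = quadratic_form(invert_2x2(C), Q)               # eq. (5.45)
print(energy_q_phi, energy_c, energy_p)
```

All three expressions give the same number, here 9 in the arbitrary units of the example.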
5.3 Work and energy in magnetostatics


We now discuss the corresponding issues in the context of magneto- 
statics. We begin by computing the work necessary to build a static 
magnetic field configuration. The source of the magnetic field is a cur- 
rent distribution j(x), and we must therefore ask what work is necessary 
to build up such a current distribution, starting from zero. 

For the sake of the argument, imagine an external agent that “grabs” 
an electron at rest, and starts to accelerate it, moving it in a circular 
orbit, until it reaches a given velocity corresponding to a given current. 
Concretely, a current will rather be produced by accelerating a large 
number of electrons, and the force that accelerates them will be an external
electric field, but we do not need to specify this for the following
argument. If we neglect for a moment any effect related to electromag- 
netism, we would conclude that the work done by the external force goes 
into the kinetic energy of the electron; or, in the more realistic setting 
of an electric wire, it goes partly in the kinetic energy of the electrons 
and partly into dissipation in the wire. Once we include Maxwell’s equa- 
tions in our considerations, we realize that the final configuration, with 
a given final current, generates a magnetic field. We already know that 
a magnetic field carries energy, so energy conservation must now take 
a different form: to conserve energy, the work that has been done by 
the external force during the process of building the final configuration 
must be equal to the kinetic energy of the final electron (or the kinetic 
energy of the electrons in a wire plus the energy lost to dissipation, in 
a more realistic setting) plus the energy of the final magnetic field. In 
other words, the work done by the external force to accelerate a particle 
to a given velocity must be different when the particle is charged from 
when it is electrically neutral, and the difference must be precisely the 
work done to create the corresponding magnetic field. 

The reason why the work is different for a charged and for a neutral
particle is that the charged particle interacts with the electromagnetic
field that is being built. The magnetic field, by itself, does no work
on the charged particles on which it acts, given that $(\mathbf{v}\times\mathbf{B})\cdot(\mathbf{v}\,dt) = 0$.
However, as we raise the current, the magnetic field that it generates
also increases. Therefore, during the transient period needed to reach the
final static field configuration starting from zero initial field, $\partial\mathbf{B}/\partial t$ is
non-zero. According to Faraday’s law (3.4), a non-zero value of $\partial\mathbf{B}/\partial t$
generates an electric field (the “induced” electric field). The induced
electric field performs work on the charged particles that make up the
current and, as we have seen in Section 4.3.1, for a closed loop it acts
to oppose the increase of the magnetic flux (Lenz’s law). So, we have
to supply extra external work to counterbalance the work done by this
induced electric field during the transient period needed to reach the
desired final value of the magnetic field, even if this final configuration is
static. One might try to circumvent the problem by raising the current
very slowly, so that $\partial\mathbf{B}/\partial t$ is very small, and, at any given time, the
effect might seem negligible. However, in this case it would take a very
long time to reach the desired final value of $\mathbf{j}$ and, as the following
computation will make clear, the two effects compensate.

The work per unit time done by an electric field on a current
distribution $\mathbf{j}(\mathbf{x})$ has already been computed in eq. (3.48). The work done
by the external agent is the negative of this, so, at a time $t$ during the
transient period, when the current has a value $\mathbf{j}(t,\mathbf{x})$, we have

$$\frac{dW_{\rm ext}}{dt} = -\int d^3x\,\mathbf{E}(t,\mathbf{x})\cdot\mathbf{j}(t,\mathbf{x})\,, \tag{5.46}$$

where $\mathbf{E}(t,\mathbf{x})$ is the induced electric field at that time. In a
magnetostatic setting, we want to raise the current very slowly; then the
Ampère–Maxwell law (3.2) reduces to Ampère’s law, and

$$\mathbf{j}(\mathbf{x}) = \frac{1}{\mu_0}\,\nabla\times\mathbf{B}\,. \tag{5.47}$$

In other words, while we must take into account $\partial\mathbf{B}/\partial t$ during the
transient period, otherwise we miss the leading contribution to $W_{\rm ext}$, we can
assume that the time derivative of the induced electric field is sufficiently
small to be neglected. Then eq. (5.46) becomes

$$\begin{aligned}
\frac{dW_{\rm ext}}{dt} &= -\frac{1}{\mu_0}\int d^3x\,\mathbf{E}\cdot(\nabla\times\mathbf{B})\\
&= -\frac{1}{\mu_0}\int d^3x\,\mathbf{B}\cdot(\nabla\times\mathbf{E})\\
&= \frac{1}{\mu_0}\int d^3x\,\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t}\\
&= \frac{d}{dt}\left[\frac{1}{2\mu_0}\int d^3x\,|\mathbf{B}(t,\mathbf{x})|^2\right],
\end{aligned}\tag{5.48}$$

where, in going from the first to the second line, we have integrated by
parts, so that $E_i(\epsilon_{ijk}\partial_j B_k)$ gives $-(\epsilon_{ijk}\partial_j E_i)B_k$, which is the same as
$+(\epsilon_{kji}\partial_j E_i)B_k = \mathbf{B}\cdot(\nabla\times\mathbf{E})$, and in going from the second to the third
line we used Faraday’s law (3.4). Therefore,
integrating with respect to time, with the initial condition that $W_{\rm ext} = 0$
at the initial time when $\mathbf{B} = 0$, we find that the work needed to obtain
a given final static field $\mathbf{B}(\mathbf{x})$ is

$$W_{\rm ext} = \frac{1}{2\mu_0}\int d^3x\,|\mathbf{B}|^2\,, \tag{5.49}$$

in full agreement with the general result (3.41) with $\mathbf{E} = 0$. Therefore,
the energy associated with a static magnetic field $\mathbf{B}$, defined as the work
needed to build the configuration of currents that generates it, is

$$\mathcal{E}_B = \frac{1}{2\mu_0}\int d^3x\,\mathbf{B}^2\,, \tag{5.50}$$


and agrees with the expression of the energy stored in the final magnetic 
field obtained from eq. (3.41). 
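
The statement that raising the current slowly does not reduce the total work can be illustrated with a lumped model, which is our own simplifying assumption and is not used in the derivation above: a single loop with self-inductance $L$, so that the induced electromotive force is $-L\,dI/dt$ and the external agent must supply a power $L\,I\,dI/dt$. The sketch integrates this power for two ramp shapes and for very different ramp durations $T$; the values of $L$ and of the final current are hypothetical.

```python
import math

def midpoint(f, a, b, n=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L_SELF = 2.0    # hypothetical self-inductance (arbitrary units)
I_FINAL = 3.0   # hypothetical final current

def work_linear_ramp(T):
    """External work for the ramp I(t) = I_FINAL * t / T, with 0 <= t <= T."""
    current = lambda t: I_FINAL * t / T
    rate = lambda t: I_FINAL / T
    return midpoint(lambda t: L_SELF * current(t) * rate(t), 0.0, T)

def work_smooth_ramp(T):
    """External work for the ramp I(t) = I_FINAL * sin^2(pi * t / (2 * T))."""
    current = lambda t: I_FINAL * math.sin(0.5 * math.pi * t / T) ** 2
    rate = lambda t: I_FINAL * (0.5 * math.pi / T) * math.sin(math.pi * t / T)
    return midpoint(lambda t: L_SELF * current(t) * rate(t), 0.0, T)

for T in (1.0, 1000.0):
    print(T, work_linear_ramp(T), work_smooth_ramp(T))
```

In every case the work equals $L I^2/2$, here 9 in the units of the example: a slower ramp lowers the instantaneous power but lengthens the time over which it must be supplied, and the two effects compensate exactly.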
5.4 Energy stored in a static magnetic field


In this section, in analogy with the treatment in Section 5.2 for electro- 
statics, we rewrite the energy (5.50) in some other useful forms, in terms 
of the vector potential A(x) corresponding to the final magnetic field 
and of the final current distribution j(x) that generates it, and we will 
also write the corresponding expression when the current distribution 
corresponds to a set of closed loops.'? We write 


Ep = A ee 

210 

= — | #eB(VxA) 
210 
1 
os ( ) 
1 

= z | Eria, (5.51) 


where in the second line we integrated by parts [similarly to the passages 
made after eq. (5.48)] and we then used eq. (5.47). Note that the last 
passage is again specific to magnetostatics. Therefore, in magnetostatics 
we can write the energy as¹³

$$E_B = \frac{1}{2}\int d^3x\, \mathbf{j}(\mathbf{x})\cdot\mathbf{A}(\mathbf{x})\,. \tag{5.52}$$


This is the analog of eq. (5.15) in electrostatics. Another useful form is 
obtained using the solution for A(x) in terms of the current distribution 
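given by eq. (4.92),¹⁴ which gives

$$E_B = \frac{\mu_0}{8\pi}\int d^3x\,d^3x'\,\frac{\mathbf{j}(\mathbf{x})\cdot\mathbf{j}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.53}$$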


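This is the analog for magnetostatics of eq. (5.16) in electrostatics.
Similarly to eq. (5.21), we can take the Fourier transform of eq. (5.53), and
write

$$E_B = \frac{\mu_0}{2}\int\frac{d^3k}{(2\pi)^3}\,\frac{\mathbf{j}(-\mathbf{k})\cdot\mathbf{j}(\mathbf{k})}{k^2}\,. \tag{5.54}$$

As in eq. (5.21), the convergence of the integral is assured if j(k) goes
to zero faster than $1/k^{1/2}$.

If $\mathbf{j}(\mathbf{x}) = \mathbf{j}_1(\mathbf{x}) + \mathbf{j}_2(\mathbf{x})$, where $\mathbf{j}_1(\mathbf{x})$ and $\mathbf{j}_2(\mathbf{x})$ are localized in two
non-overlapping volumes $V_1$ and $V_2$, eq. (5.53) becomes

$$E_B = E_B^{(1)} + E_B^{(2)} + E_B^{\rm int}\,, \tag{5.55}$$

where

$$E_B^{(a)} = \frac{\mu_0}{8\pi}\int d^3x\,d^3x'\,\frac{\mathbf{j}_a(\mathbf{x})\cdot\mathbf{j}_a(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.56}$$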


12 Note that, in the electrostatic case,
we considered separately a discrete and 
a continuous distribution of charges. 
In magnetostatics, currents are neces- 
sarily extended, so we work directly 
in the continuous formalism. Alterna- 
tively, one could consider the interac- 
tion between two closed loops, with a 
size small compared to their distance. 
We will discuss this setting in the con- 
text of the multipole expansion in Sec- 
tion 6.3, where we will see that this cor- 
responds to the magnetic dipole inter- 
action. Since the equivalent of the elec- 
tric charge (the electric “monopole,” 
in the language of multipole expan- 
sion) does not exist for the magnetic 
source, the analogy with electrostatics 
is clearer working directly in terms of 
extended distributions. 


13 Observe that this expression is gauge invariant, as it should be for an
energy. This is evident from the original expression (5.50), because it only
involves B, which is gauge invariant. Equation (5.52), in contrast, is written
in terms of A, which is not gauge invariant. However, under the gauge
transformation $\mathbf{A}\to\mathbf{A}-\nabla\theta$,

$$\int d^3x\,\mathbf{j}\cdot\mathbf{A} \;\to\; \int d^3x\,\mathbf{j}\cdot(\mathbf{A}-\nabla\theta)\,,$$

and the extra term vanishes after integrating by parts and using
$\nabla\cdot\mathbf{j} = 0$, which is valid in magnetostatics, see eq. (4.90).


14 This solution for A was obtained in
the Coulomb gauge but, as shown in 
the previous note, eq. (5.52) is anyhow 
gauge invariant. 
15 Note that, as in eq. (5.13), which is the analogous result for
electrostatics, the integral over $d^3x$ takes contributions only when x is
inside the volume $V_1$ where $\mathbf{j}_1(\mathbf{x})$ is non-vanishing, and the integral over
$d^3x'$ takes contributions only when x′ is inside the volume $V_2$ where
$\mathbf{j}_2(\mathbf{x}')$ is non-vanishing. Therefore, having taken $V_1$ and $V_2$
non-overlapping, $|\mathbf{x}-\mathbf{x}'|$ is always above a minimum value and there is no
problem of convergence in the integral in eq. (5.57).


(with a = 1, 2), can be seen as the “self-energy” of the a-th current
distribution, while

$$E_B^{\rm int} = \frac{\mu_0}{4\pi}\int d^3x\,d^3x'\,\frac{\mathbf{j}_1(\mathbf{x})\cdot\mathbf{j}_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.57}$$

is the interaction energy between the two current distributions.¹⁵


When the back-action of $\mathbf{j}_1$ on $\mathbf{j}_2$ is negligible, from the point of view
of the first current distribution it can be useful to see the second current
distribution as generating a given “external field.” We can then rewrite
eq. (5.57) as

$$E_B^{\rm int} = \int d^3x\, \mathbf{j}_1(\mathbf{x})\cdot\mathbf{A}_{\rm ext}(\mathbf{x})\,, \tag{5.58}$$

where $\mathbf{A}_2(\mathbf{x}) \equiv \mathbf{A}_{\rm ext}(\mathbf{x})$ is the vector potential generated by $\mathbf{j}_2(\mathbf{x})$,
according to eq. (4.92). Equations (5.52) and (5.58) can be compared to
eqs. (5.15) and (5.39) for the electrostatic case.

While in electrostatics the most elementary source is provided by a
point charge, in magnetostatics the simplest source is a closed loop
carrying a current. We therefore discuss how the previous general
expressions simplify for a collection of loops. We consider a set of closed
loops $C_a$ ($a = 1,\ldots,N$), carrying currents $I_a$. From eq. (4.103),

$$\mathbf{j}(\mathbf{x})\,d^3x = \sum_{a=1}^N \mathbf{j}_a(\mathbf{x})\,d^3x \;\to\; \sum_{a=1}^N I_a\, d\boldsymbol{\ell}_a\,. \tag{5.59}$$

Then, using Stokes’s theorem (1.38), we can transform eq. (5.52) as

$$\begin{aligned}
E_B &= \frac{1}{2}\sum_{a=1}^N I_a \oint_{C_a} d\boldsymbol{\ell}_a\cdot\mathbf{A} \\
&= \frac{1}{2}\sum_{a=1}^N I_a\int_{S_a} d\mathbf{s}\cdot(\nabla\times\mathbf{A}) \\
&= \frac{1}{2}\sum_{a=1}^N I_a\int_{S_a} d\mathbf{s}\cdot\mathbf{B}\,,
\end{aligned} \tag{5.60}$$

where $S_a$ is any surface having $C_a$ as the boundary and B is the total
magnetic field generated by all the loops. The integral on the right-hand
side is just the magnetic flux of B through $S_a$, that we denote as $\Phi_{B,a}$,
so

$$E_B = \frac{1}{2}\sum_{a=1}^N I_a\,\Phi_{B,a}\,, \tag{5.61}$$
which can be compared with eq. (5.43) for a system of conductors. For
a closed loop C carrying a current I, placed in an external magnetic field
$\mathbf{B}_{\rm ext}$, proceeding as in eq. (5.60) we see that eq. (5.58) can be rewritten
as

$$(E_B)_{\rm ext} = I\,\Phi_{B_{\rm ext}}\,, \tag{5.62}$$

where we denoted the magnetic flux of $\mathbf{B}_{\rm ext}$ through C by $\Phi_{B_{\rm ext}}$.

We now observe that the knowledge of the position and shape of the
loops, and of their currents, uniquely determines the magnetic field
everywhere, and therefore the fluxes through the loops. As we have already
mentioned in Problem 4.8, the linearity of Maxwell’s equations implies
that these relations are linear, so that

$$\Phi_{B,a} = \sum_{b=1}^N L_{ab}\,I_b\,. \tag{5.63}$$

The coefficients $L_{ab}$ (with $a \neq b$) are called the mutual inductances,
while the diagonal terms $L_{aa} \equiv L_a$ are called the self-inductances; these
coefficients depend on the shapes and positions $\mathbf{x}_a$ of the loops.
Equation (5.63) is the analog of eq. (4.164) in electrostatics. The derived
SI unit of inductance is the henry (H).

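The mutual inductance between two loops $C_a$ and $C_b$ (with $a \neq b$) can
be written explicitly, in terms of the geometry of the loops, computing
the contribution to the flux through loop a produced by the current
$I_b$. Denoting this as $(\Phi_B)_{a,b}$, and using the same manipulations as in
eq. (5.60) in reverse order,

$$\begin{aligned}
(\Phi_B)_{a,b} &= \int_{S_a} d\mathbf{s}\cdot\mathbf{B}_b \\
&= \int_{S_a} d\mathbf{s}\cdot(\nabla\times\mathbf{A}_b) \\
&= \oint_{C_a} d\boldsymbol{\ell}_a\cdot\mathbf{A}_b[\mathbf{x}(\ell_a)]\,,
\end{aligned} \tag{5.64}$$

where $\mathbf{B}_b$ and $\mathbf{A}_b$ are the magnetic field and, respectively, the vector
potential, generated by the loop b, and we wrote explicitly that the
argument of $\mathbf{A}_b$ is the value of x that corresponds to the coordinate $\ell_a$
on the loop, $\mathbf{x}(\ell_a)$. Using eq. (4.104), we get

$$(\Phi_B)_{a,b} = \frac{\mu_0}{4\pi}\,I_b\oint_{C_a}\oint_{C_b}\frac{d\boldsymbol{\ell}_a\cdot d\boldsymbol{\ell}_b}{|\mathbf{x}(\ell_a)-\mathbf{x}(\ell_b)|}\,. \tag{5.65}$$

This shows that a relation of the form (5.63) indeed holds, and provides
an explicit expression for the coefficients $L_{ab}$,

$$L_{ab} = \frac{\mu_0}{4\pi}\oint_{C_a}\oint_{C_b}\frac{d\boldsymbol{\ell}_a\cdot d\boldsymbol{\ell}_b}{|\mathbf{x}(\ell_a)-\mathbf{x}(\ell_b)|}\,. \tag{5.66}$$

From this explicit expression we see that $L_{ab} = L_{ba}$. Using eq. (5.63),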
This section is quite long and technical,
and should be skipped at first reading. 


16 It should be stressed that force, with
its instantaneous character, is an intrin- 
sically non-relativistic notion that has 
no place in a full relativistic theory, and 
the same holds for the “instantaneous” 
mechanical potential U (in the various 
forms that it can take, that will be de- 
scribed in the following). Therefore, 
the whole discussion in this section is 
only valid for non-relativistic sources. 
In Section 12.2.2, we will see how the ef- 
fective dynamics of a system of charges 
can be written in terms of instanta- 
neous interactions, order by order in 
v/c. 


eq. (5.61) can be rewritten as

$$E_B = \frac{1}{2}\sum_{a,b=1}^N L_{ab}\,I_a I_b\,. \tag{5.67}$$

This is analogous to eq. (5.44) for conductors. Therefore the a-th loop
has a self-energy

$$(E_B)_a = \frac{1}{2}L_a I_a^2\,, \tag{5.68}$$

while, using $L_{ab} = L_{ba}$, the interaction energy between loop a and loop
b is

$$(E_B)_{ab} = L_{ab}\,I_a I_b\,, \tag{5.69}$$

or, using the explicit expression (5.66),

$$(E_B)_{ab} = \frac{\mu_0}{4\pi}\,I_a I_b\oint_{C_a}\oint_{C_b}\frac{d\boldsymbol{\ell}_a\cdot d\boldsymbol{\ell}_b}{|\mathbf{x}(\ell_a)-\mathbf{x}(\ell_b)|}\,. \tag{5.70}$$

From eq. (5.68), we see that the self-inductance of a loop must be a
positive quantity and provides a measure of how much energy should be
given to a circuit to raise its current to a given value. The larger $L_a$,
the larger is the energy that must be supplied to reach a given value of
$I_a$ just as, in the expression $E_{\rm kin} = (1/2)mv^2$ of the kinetic energy of
a non-relativistic particle, the larger the mass, the larger is the energy
that must be supplied to reach a given velocity. More generally, the
condition that the quadratic form (5.67) must be positive imposes the
constraints $L_a > 0$ and

$$L_a L_b > L_{ab}^2\,. \tag{5.71}$$
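
The double line integral in eq. (5.66) is easy to evaluate numerically. The sketch below is a minimal illustration, assuming two coaxial circular loops discretized into straight segments; the helper names `loop_points` and `mutual_inductance` are hypothetical. As a cross-check it uses the standard on-axis field of a circular loop, $B_z = \mu_0 I b^2/[2(b^2+d^2)^{3/2}]$, which gives the flux through a much smaller coaxial loop of radius a as approximately $\pi a^2 B_z$.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability in SI units (H/m)

def loop_points(radius, z, n):
    """Midpoints and line elements of a circular loop of given radius in the plane at height z."""
    t = (np.arange(n) + 0.5) * 2 * np.pi / n
    pts = np.stack([radius * np.cos(t), radius * np.sin(t), np.full(n, z)], axis=1)
    dl = np.stack([-radius * np.sin(t), radius * np.cos(t), np.zeros(n)], axis=1) * (2 * np.pi / n)
    return pts, dl

def mutual_inductance(pts_a, dl_a, pts_b, dl_b):
    """Discretized version of eq. (5.66): (mu0/4pi) * sum of dl_a . dl_b / |x_a - x_b|."""
    dist = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=2)
    dots = dl_a @ dl_b.T
    return MU0 / (4 * np.pi) * np.sum(dots / dist)

if __name__ == "__main__":
    a, b, d = 0.01, 0.5, 1.0  # small loop radius, large loop radius, axial separation (m)
    pts_a, dl_a = loop_points(a, d, 400)
    pts_b, dl_b = loop_points(b, 0.0, 400)
    l_ab = mutual_inductance(pts_a, dl_a, pts_b, dl_b)
    l_ba = mutual_inductance(pts_b, dl_b, pts_a, dl_a)
    small_loop_estimate = MU0 * np.pi * a**2 * b**2 / (2 * (b**2 + d**2) ** 1.5)
    print(l_ab, l_ba, small_loop_estimate)
```

The symmetry $L_{ab} = L_{ba}$ is manifest in the discretized sum as well. Note that the same formula cannot be applied naively with a = b to an infinitely thin wire, since the integrand then becomes singular.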
In non-relativistic mechanics, conservative forces acting on a system can
be obtained from mechanical potential functions, that we generically
denote by U, defined by the fact that the force can be written as $\mathbf{F} = -\nabla U$.
Similar relations hold for the forces acting on charge and current
distributions in the full Maxwell theory.¹⁶ However, one must be aware of
some subtleties: a correct treatment is analogous to that of thermody- 
namical potentials, where one must specify which, among the variables 
that determine the state of the system, are kept constant, and which are 
varied, and different mechanical potentials, related by Legendre trans- 
forms, are appropriate to different situations. 

Let us begin with the most obvious example, which is the Coulomb
potential. We consider a system of N point charges, $q_1,\ldots,q_N$, located
at positions $\mathbf{x}_1,\ldots,\mathbf{x}_N$, respectively. The force $\mathbf{F}_a$ acting on the a-th
charge is obtained summing the contribution from all other charges,
which are given individually by Coulomb’s law (4.8), and is therefore

$$\mathbf{F}_a = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a} q_a q_b\,\frac{\mathbf{x}_a-\mathbf{x}_b}{|\mathbf{x}_a-\mathbf{x}_b|^3}\,. \tag{5.72}$$

Using eq. (4.19), we see that this force can be written as

$$\mathbf{F}_a = -\frac{\partial U_{\rm Coul}}{\partial\mathbf{x}_a}\,, \tag{5.73}$$

where¹⁷

$$U_{\rm Coul}(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{8\pi\epsilon_0}\sum_{a=1}^N\sum_{b\neq a}\frac{q_a q_b}{|\mathbf{x}_a-\mathbf{x}_b|}\,. \tag{5.74}$$

This expression defines the Coulomb potential. Comparing with eq. (5.7), 


we see that the Coulomb potential is the same as the energy of a system 
of static charges (which is the same as their interaction energy, given 
that the kinetic energy vanishes for static charges). Thus, eqs. (5.73), 
(5.74), and (5.7) reveal that the mechanical potential from which we 
can derive the force on a static charge due to all other static charges 
is equal to the energy of the whole configuration of static charges. To 
mitigate the feeling that we are elaborating on the obvious, let us an- 
ticipate that we will find in Section 5.5.1 that this result only holds 
because we have implicitly assumed that the charges on the individual 
bodies are conserved. For a set of point particles this is a somewhat ob- 
vious assumption; however, in other contexts other options are possible; 
in particular, for a set of conductors, one could rather fix the value of 
the electrostatic potentials on their surfaces, and in that case we will 
find that the equality between the relevant mechanical potential and the 
energy of a static configuration does not hold. We will meet similar 
situations in magnetostatics. 
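
The relation between eqs. (5.72), (5.73), and (5.74) can be checked numerically. The following sketch, with hypothetical function names and arbitrary illustrative charges and positions, compares the Coulomb force on one charge with minus the finite-difference gradient of $U_{\rm Coul}$, taken at fixed charges.

```python
import numpy as np

K = 8.9875517923e9  # 1/(4 pi eps0) in SI units (N m^2 / C^2)

def coulomb_potential(q, x):
    """U_Coul of eq. (5.74): (K/2) * sum over a != b of q_a q_b / |x_a - x_b|."""
    total = 0.0
    for a in range(len(q)):
        for b in range(len(q)):
            if b != a:
                total += q[a] * q[b] / np.linalg.norm(x[a] - x[b])
    return 0.5 * K * total

def coulomb_force(q, x, a):
    """Force on charge a from eq. (5.72)."""
    force = np.zeros(3)
    for b in range(len(q)):
        if b != a:
            r = x[a] - x[b]
            force += K * q[a] * q[b] * r / np.linalg.norm(r) ** 3
    return force

def numerical_force(q, x, a, h=1e-6):
    """Minus the central-difference gradient of U_Coul with respect to x_a."""
    grad = np.zeros(3)
    for i in range(3):
        x_plus = x.copy()
        x_minus = x.copy()
        x_plus[a, i] += h
        x_minus[a, i] -= h
        grad[i] = (coulomb_potential(q, x_plus) - coulomb_potential(q, x_minus)) / (2 * h)
    return -grad

if __name__ == "__main__":
    q = np.array([1e-6, -2e-6, 3e-6])
    x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.3, 0.8, -0.5]])
    print(coulomb_force(q, x, 0), numerical_force(q, x, 0))
```

Because the potential carries the factor 1/2 and counts each pair twice, the gradient with respect to $\mathbf{x}_a$ picks up two equal contributions, which is how the prefactor $1/(8\pi\epsilon_0)$ turns into $1/(4\pi\epsilon_0)$ in the force.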

First, however, we examine the generalization to a continuous charge
distribution. The forces between two charge distributions $\rho_1(\mathbf{x})$ and
$\rho_2(\mathbf{x})$, localized in two non-overlapping volumes $V_1$ and $V_2$, have already
been computed in eqs. (4.26) and (4.27), while the continuous
generalization of the Coulomb interaction potential (5.74) is given by¹⁸

$$U_E^{\rm int} = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_1(\mathbf{x})\rho_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.75}$$

We now want to understand how to get the forces in eqs. (4.26) and
(4.27) by taking gradients of this potential. At first, one might even
wonder where the variables are with respect to which we could take
derivatives of $U_E^{\rm int}$, since in eq. (5.75) x and x′ are integrated over and
therefore do not appear as free variables on the left-hand side. Actually,
the required variables are hidden in $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x}')$, through the fact
that these charge distributions are localized in the volumes $V_1$ and $V_2$,
respectively. We take $V_1$ and $V_2$ to be volumes with a fixed shape and
orientation, so their position in space can be given by specifying the
coordinates $\mathbf{x}_1$ and $\mathbf{x}_2$ of just one of their points, such as, for instance,
their geometrical centers. For instance, if $V_1$ and $V_2$ are spheres of given
radii $R_1$ and $R_2$, we can take $\mathbf{x}_1$ and $\mathbf{x}_2$ as the respective centers of these
spheres. A change in $\mathbf{x}_1$ corresponds to a rigid displacement of the first
charge distribution, and similarly a change in $\mathbf{x}_2$ corresponds to a rigid
displacement of the second charge distribution. Let us denote by $\rho_1^{(0)}(\mathbf{x})$


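17 When computing $\partial/\partial\mathbf{x}_a$, one must be careful to use a different dummy
index in the double summation, writing e.g.,

$$\frac{\partial U_{\rm Coul}}{\partial\mathbf{x}_a} = \frac{1}{8\pi\epsilon_0}\,\frac{\partial}{\partial\mathbf{x}_a}\sum_{c=1}^N\sum_{b\neq c}\frac{q_c q_b}{|\mathbf{x}_c-\mathbf{x}_b|}\,.$$

Then, there are two equal contributions from c = a and from b = a, that
transform the factor $1/(8\pi\epsilon_0)$ in the potential into $1/(4\pi\epsilon_0)$ in the
force.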


18 The overall factor $1/(4\pi\epsilon_0)$ in eq. (5.75) differs by a factor of 2
from the factor $1/(8\pi\epsilon_0)$ in eq. (5.74); this comes from the (by now,
usual) fact that, setting e.g., N = 2, in eq. (5.74) we have
$\sum_{a=1}^{2}\sum_{b=1}^{2}$ with the condition $b \neq a$, so there are two equal
contributions to the sum, from (a = 1, b = 2) and from (a = 2, b = 1).
The result is then the same as that obtained from eq. (5.75) when we
set $\rho_1(\mathbf{x}) = q_1\delta^{(3)}(\mathbf{x}-\mathbf{x}_1)$ and $\rho_2(\mathbf{x}') = q_2\delta^{(3)}(\mathbf{x}'-\mathbf{x}_2)$.
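19 The explicit computation is as follows:

$$\begin{aligned}
\frac{\partial}{\partial\mathbf{x}_1}\int d^3x\,d^3x'\,\frac{\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)}{|\mathbf{x}-\mathbf{x}'|}
&= -\int d^3x\,d^3x'\,\frac{\partial\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)}{\partial\mathbf{x}}\,\frac{\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)}{|\mathbf{x}-\mathbf{x}'|} \\
&= \int d^3x\,d^3x'\,\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\,\frac{\partial}{\partial\mathbf{x}}\frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= -\int d^3x\,d^3x'\,\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,,
\end{aligned}$$

where we used $\partial\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)/\partial\mathbf{x}_1 = -\partial\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)/\partial\mathbf{x}$, we then
integrated $\partial/\partial\mathbf{x}$ by parts (discarding the boundary term since $\rho_1$ is
localized), and we finally used eq. (4.19).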


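the first charge density distribution in the reference frame where $\mathbf{x}_1 = 0$.
Then, in a translated reference frame where the coordinate specifying
the position of $V_1$ has the generic value $\mathbf{x}_1$, we have

$$\rho_1(\mathbf{x}) = \rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,. \tag{5.76}$$

Similarly, denoting by $\rho_2^{(0)}(\mathbf{x})$ the second charge density distribution in
the reference frame where $\mathbf{x}_2 = 0$, in a generic frame we have

$$\rho_2(\mathbf{x}) = \rho_2^{(0)}(\mathbf{x}-\mathbf{x}_2)\,. \tag{5.77}$$

For instance, for a point charge $q_1$ located at the origin, $\rho_1^{(0)}(\mathbf{x}) =
q_1\delta^{(3)}(\mathbf{x})$, and then $\rho_1(\mathbf{x}) = q_1\delta^{(3)}(\mathbf{x}-\mathbf{x}_1)$ is the charge density in the
frame where the charge is located in the position $\mathbf{x}_1$. So, $\mathbf{x}_1$ and $\mathbf{x}_2$
are the “hidden” variables on which the potential depends. The use
of the densities $\rho_a^{(0)}(\mathbf{x})$ has the advantage of explicitly bringing out the
dependence on $\mathbf{x}_1$ and $\mathbf{x}_2$, which is otherwise implicit in $\rho_a(\mathbf{x})$, and
eqs. (4.26), (4.27), and (5.75) can be written, more explicitly, as

$$\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2) = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,, \tag{5.78}$$

$$\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\,\frac{\mathbf{x}'-\mathbf{x}}{|\mathbf{x}-\mathbf{x}'|^3}\,, \tag{5.79}$$

and

$$U_E^{\rm int}(\mathbf{x}_1,\mathbf{x}_2) = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.80}$$

We can now check that

$$\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\partial U_E^{\rm int}}{\partial\mathbf{x}_1} \tag{5.81}$$

and, similarly, $\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = -\partial U_E^{\rm int}/\partial\mathbf{x}_2$, where it is understood that
the partial derivative with respect to $\mathbf{x}_1$ is taken at $\mathbf{x}_2$ constant, and
vice versa.¹⁹ Also note, for future reference, that the derivatives have
been taken keeping fixed the total charges in the volumes $V_1$ and $V_2$.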

Comparing eq. (5.75) with eq. (5.13) we see that, also in the continuous
case, the mechanical potential from which the forces acting on the
charges can be derived is the same as the energy that these charges have
in a static configuration, i.e., their interaction energy. If, instead of just
two charge distributions $\rho_1$ and $\rho_2$, we have N of them, labeled as $\rho_a(\mathbf{x})$
and localized in volumes $V_a$, identified by coordinates $\mathbf{x}_a$, eq. (5.75) is
generalized by summing over all possible pairs,

$$U_E^{\rm int}(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{2}\sum_{a=1}^N\sum_{b\neq a}\frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_a(\mathbf{x})\rho_b(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.82}$$

where, as usual, we have counted each pair in the sum twice, and
compensated for this by an extra factor 1/2; again, the dependence on $\mathbf{x}_a$ is
hidden in $\rho_a(\mathbf{x})$, and can be shown explicitly using instead $\rho_a^{(0)}(\mathbf{x}-\mathbf{x}_a)$.
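It is useful to define

$$U_E[\rho] = \frac{1}{8\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho(\mathbf{x})\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.83}$$

Similarly to eqs. (5.12) and (5.13), we now write $\rho(\mathbf{x}) = \rho_1(\mathbf{x}) + \rho_2(\mathbf{x})$,
with $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x})$ localized to volumes $V_1$ and $V_2$, respectively,
identified by the positions $\mathbf{x}_1$ and $\mathbf{x}_2$ and, correspondingly, we use the notation
$U_E(\mathbf{x}_1,\mathbf{x}_2)$ for $U_E[\rho]$. Then, we get

$$U_E(\mathbf{x}_1,\mathbf{x}_2) = U_E^{(1)} + U_E^{(2)} + U_E^{\rm int}(\mathbf{x}_1,\mathbf{x}_2)\,, \tag{5.84}$$

where

$$U_E^{(a)} = \frac{1}{8\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_a(\mathbf{x})\rho_a(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.85}$$

while $U_E^{\rm int}(\mathbf{x}_1,\mathbf{x}_2)$ is given by eq. (5.75). In terms of the densities $\rho_a^{(0)}$,

$$U_E^{\rm int}(\mathbf{x}_1,\mathbf{x}_2) = \frac{1}{4\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\,\rho_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.86}$$

while

$$U_E^{(a)} = \frac{1}{8\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho_a^{(0)}(\mathbf{x}-\mathbf{x}_a)\,\rho_a^{(0)}(\mathbf{x}'-\mathbf{x}_a)}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.87}$$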

We now observe that, shifting simultaneously the integration variables
$\mathbf{x}\to\mathbf{x}+\mathbf{x}_a$ and $\mathbf{x}'\to\mathbf{x}'+\mathbf{x}_a$ we can eliminate the dependence on
$\mathbf{x}_a$ in eq. (5.87), since this eliminates the dependence on $\mathbf{x}_a$ from both
$\rho_a^{(0)}(\mathbf{x}-\mathbf{x}_a)$ and $\rho_a^{(0)}(\mathbf{x}'-\mathbf{x}_a)$, while $|\mathbf{x}-\mathbf{x}'|$ is invariant under this shift.
So, in the end, for each a, $U_E^{(a)}$ is independent of $\mathbf{x}_a$. This is physically
clear, since it means that the self-energy of the a-th charge distribution
is independent of where the volume $V_a$ is located, i.e., is invariant under
translations. Therefore, we can use $U_E(\mathbf{x}_1,\mathbf{x}_2)$ as a mechanical potential
function in place of $U_E^{\rm int}(\mathbf{x}_1,\mathbf{x}_2)$, i.e.,

$$\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\partial U_E}{\partial\mathbf{x}_1}\,, \tag{5.88}$$

and, similarly, $\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = -\partial U_E/\partial\mathbf{x}_2$. The form (5.83) of the
mechanical potential is convenient because it is also valid for N charged objects
with densities $\rho_a(\mathbf{x})$ ($a = 1,\ldots,N$), localized in non-overlapping
volumes $V_a$ identified by the position $\mathbf{x}_a$, simply setting $\rho(\mathbf{x}) = \sum_a\rho_a(\mathbf{x})$.
Then, eq. (5.83) becomes

$$U_E(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{8\pi\epsilon_0}\sum_{a,b=1}^N\int d^3x\,d^3x'\,\frac{\rho_a(\mathbf{x})\rho_b(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.89}$$

The terms of the sum with a = b are self-energy terms that, as we have
seen, are independent of all coordinates $\mathbf{x}_a$ and do not contribute when
taking the gradients, while the terms with $a \neq b$ give back eq. (5.82).
We can also use eq. (5.89) for point charges, with the understanding
20 For extended bodies, in principle, one should also include the variables
that define their shapes and orientation in space. Such variables could be
added trivially to the steps that we will perform in the following. To keep
the notation simpler, we neglect these variables, assuming for instance that
the bodies are spherical, of fixed radii. Each volume $V_a$ can then be
identified uniquely by the position $\mathbf{x}_a$ of its center. The treatment
discussed here can then be generalized to the inclusion of thermodynamic
variables such as the temperature, see Chapter 4 of Landau and Lifschits
(1984).


that the self-energy terms are regularized as in eq. (5.31), and then are 
again irrelevant constants. 

Comparison of eqs. (5.83) and (5.16) shows that the mechanical potential
$U_E$ needed to compute the forces is the same as the total interaction
energy $E_E$ of the system of static charge distributions,

$$U_E = E_E\,. \tag{5.90}$$

Given the equality between eqs. (5.15) and (5.16), we can also rewrite
$U_E$ in the form

$$U_E = \frac{1}{2}\int d^3x\,\rho(\mathbf{x})\,\phi(\mathbf{x})\,, \tag{5.91}$$

where $\phi(\mathbf{x})$ is the total electrostatic potential generated by $\rho(\mathbf{x})$.


5.5.1 Mechanical potentials for conductors 


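In the previous section, we have considered a system of N charged bodies,
with charge densities $\rho_a(\mathbf{x})$ localized in non-overlapping volumes $V_a$,
so that the total charge density is $\rho(\mathbf{x}) = \sum_a\rho_a(\mathbf{x})$. We have identified
the position of the volume $V_a$ by a coordinate $\mathbf{x}_a$ that describes rigid
displacements of the volume (corresponding, e.g., to the geometrical center
of the body). The total charge of each extended body is

$$Q_a = \int_{V_a} d^3x\,\rho_a(\mathbf{x})\,. \tag{5.92}$$

Since $\rho_a(\mathbf{x})$ vanishes outside the volume $V_a$, it is actually not necessary
to restrict explicitly the integral in $d^3x$ to the volume $V_a$. We have then
found that the force acting on the a-th charge can be written as

$$\mathbf{F}_a = -\frac{\partial}{\partial\mathbf{x}_a}U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N)\,, \tag{5.93}$$

where $U_E$ can be written in the equivalent forms

$$\begin{aligned}
U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) &= \frac{1}{8\pi\epsilon_0}\int d^3x\,d^3x'\,\frac{\rho(\mathbf{x})\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \\
&= \frac{1}{8\pi\epsilon_0}\sum_{a,b=1}^N\int d^3x\,d^3x'\,\frac{\rho_a(\mathbf{x})\rho_b(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \\
&= \frac{1}{8\pi\epsilon_0}\sum_{a,b=1}^N\int d^3x\,d^3x'\,\frac{\rho_a^{(0)}(\mathbf{x}-\mathbf{x}_a)\,\rho_b^{(0)}(\mathbf{x}'-\mathbf{x}_b)}{|\mathbf{x}-\mathbf{x}'|}\,.
\end{aligned} \tag{5.94}$$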

Compared to the previous section, we have slightly changed the notation,
also writing explicitly the dependence of $U_E$ on the charges $Q_a$.²⁰ This
different notation reflects a broader point of view, inspired by the
treatment of thermodynamic potentials in statistical physics. First of
all, we have already observed in Section 5.1 that the work needed to
build a configuration of charges is independent of the order in which we
bring the charges from infinity, and of the path that we follow to bring
them to their final positions. In the language of thermodynamics, this
means that $U_E$ in eq. (5.94) is a function of state, i.e., depends only on
the variables $Q = \{Q_1,\ldots,Q_N\}$ and $x = \{\mathbf{x}_1,\ldots,\mathbf{x}_N\}$ that identify the
final state, and not on how we reached it. A second point to stress is
that the definition of functions of state depends on which quantities are
kept constant during the process of building the given configuration. In
the case where we built a configuration of point charges bringing them
from infinity, it was implicit that the charges of the individual particles
were kept constant. For a set of elementary charges this was an obvious
assumption, since the electric charge associated with each elementary
particle is a conserved quantity. However, now it is useful to keep this in
mind and, in eq. (5.94), we stressed this by writing explicitly $Q_1,\ldots,Q_N$
among the arguments of $U_E$.

We now restrict to the case of extended conductors. In this case,
starting from eq. (5.91) and performing the same steps as in eqs. (5.41)–(5.43),
we can write

$$U_E = \frac{1}{2}\sum_{a=1}^N Q_a\phi_a\,. \tag{5.95}$$

A more accurate notation is

$$U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{2}\sum_{a=1}^N Q_a\,\phi_a(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N)\,, \tag{5.96}$$

which stresses that, for given values of the charges $Q_a$, the value of the
electrostatic potential at the surface of the a-th conductor, $\phi_a$, depends
on the positions $\mathbf{x}_1,\ldots,\mathbf{x}_N$ of all conductors (as identified by their
centers, see Note 20 on page 118), as well as on their charges.

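We now want to compute the partial derivative of $U_E$ with respect to
$Q_a$, for a given a, while keeping fixed all other charges, and all the
variables $\mathbf{x}_1,\ldots,\mathbf{x}_N$. Starting from eq. (5.96) would require us to compute
the derivative of $\phi_a$ with respect to $Q_a$. Actually, it is simpler to use
$U_E$ in the form given in the second line of eq. (5.94), and compute how
$U_E$ changes under

$$\rho_a \to \rho_a + \delta\rho_a\,. \tag{5.97}$$

We denote this variation by $\delta_a$, so that $\delta_a[\rho_b(\mathbf{x})] = \delta_{ab}\,\delta\rho_a(\mathbf{x})$, where $\delta_{ab}$
is the Kronecker symbol. Then (taking care of changing the name of the
dummy summation indices),

$$\begin{aligned}
\delta_a U_E &= \frac{1}{8\pi\epsilon_0}\sum_{b,c=1}^N\int d^3x\,d^3x'\,\frac{\delta_a[\rho_b(\mathbf{x})\rho_c(\mathbf{x}')]}{|\mathbf{x}-\mathbf{x}'|} \\
&= \frac{1}{8\pi\epsilon_0}\sum_{b,c=1}^N\int d^3x\,d^3x'\,\frac{[\delta_{ab}\,\delta\rho_a(\mathbf{x})]\,\rho_c(\mathbf{x}') + \rho_b(\mathbf{x})\,[\delta_{ac}\,\delta\rho_a(\mathbf{x}')]}{|\mathbf{x}-\mathbf{x}'|} \\
&= \int d^3x\,\delta\rho_a(\mathbf{x})\,\frac{1}{4\pi\epsilon_0}\sum_{c=1}^N\int d^3x'\,\frac{\rho_c(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,,
\end{aligned} \tag{5.98}$$

where the second variation in the second line gives the same contribution
as the first, after renaming $\mathbf{x}\leftrightarrow\mathbf{x}'$ and $b\leftrightarrow c$. We now observe that

$$\frac{1}{4\pi\epsilon_0}\sum_{c=1}^N\int d^3x'\,\frac{\rho_c(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} = \phi(\mathbf{x}) \tag{5.99}$$

is the total electrostatic potential generated by the charge densities.
Therefore

$$\delta_a U_E = \int d^3x\,\delta\rho_a(\mathbf{x})\,\phi(\mathbf{x})\,. \tag{5.100}$$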


These manipulations were valid for generic charge distributions. We
now specialize to conductors using again the fact that, for conductors
at equilibrium, the charge density is zero inside the conductor, and
non-vanishing only on its surface, and that the electrostatic potential on the
surface is constant. Denoting again by $\phi_a$ this constant value of $\phi(\mathbf{x})$
on the surface of the a-th conductor, and using the fact that $\delta\rho_a(\mathbf{x})$ is
proportional to a two-dimensional Dirac delta on the surface of the a-th
conductor, eq. (5.100) becomes

$$\begin{aligned}
\delta_a U_E &= \phi_a\int d^3x\,\delta\rho_a(\mathbf{x}) \\
&= \phi_a\,\delta Q_a\,,
\end{aligned} \tag{5.101}$$

where $\delta Q_a$ is the variation of the charge $Q_a$ of the a-th conductor,
induced by the variation (5.97) of its charge density. This shows that, for
a set of conductors,

$$\frac{\partial}{\partial Q_a}U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = \phi_a\,, \tag{5.102}$$

where $\phi_a$ is the total electrostatic potential on the surface of the a-th
conductor. Therefore, for conductors we can write

$$\begin{aligned}
dU_E &= \sum_{a=1}^N\left[\left(\frac{\partial U_E}{\partial Q_a}\right)_{Q',x} dQ_a + \left(\frac{\partial U_E}{\partial\mathbf{x}_a}\right)_{Q,x'}\cdot d\mathbf{x}_a\right] \\
&= \sum_{a=1}^N\left(\phi_a\,dQ_a - \mathbf{F}_a\cdot d\mathbf{x}_a\right)\,,
\end{aligned} \tag{5.103}$$

where the subscripts in the partial derivatives indicate the variables
that are kept constant when taking the partial derivative: the subscript
$\{Q',x\}$ in $\partial U_E/\partial Q_a$ means that we keep fixed all the charges except $Q_a$,
and all the x, while the subscript $\{Q,x'\}$ in $\partial U_E/\partial\mathbf{x}_a$ means that we
keep fixed all the charges, and all the x except $\mathbf{x}_a$. In this more precise
notation, we can rewrite eqs. (5.93) and (5.102) as

$$\phi_a = \left(\frac{\partial U_E}{\partial Q_a}\right)_{Q',x}\,, \tag{5.104}$$

$$\mathbf{F}_a = -\left(\frac{\partial U_E}{\partial\mathbf{x}_a}\right)_{Q,x'}\,. \tag{5.105}$$
Therefore, we can see $U_E$ as a “mechanical potential function at fixed
charge,” in the sense that, from it, we can obtain the mechanical forces
by taking (minus) the gradient with respect to $\mathbf{x}_a$, at fixed charges.

For a set of point-like particles, working at fixed charges is the only
natural option, so, in Section 5.1 we computed the work needed to
assemble a given configuration of point charges, bringing them one by one
to the desired final location, while keeping their charges constant. For
conductors, however, there are two natural options. The first is to keep
a conductor at fixed charge as we move it from infinity to the desired
location. This is automatically obtained if the conductor is isolated, so
that the charge on it is conserved. Note that, if a conductor is isolated,
a second conductor that is moved toward it induces a change in the
electrostatic potential of its surface, so $Q_a$ is constant while $\phi_a$ is not.

Another possibility is to keep the electrostatic potential at its surface,
$\phi_a$, at a fixed value. For this, the conductor cannot be isolated. For
instance, it might be “grounded,” i.e., connected by a thin conducting wire
to the Earth, so that the potential at its surface is at the same value
as the potential of the Earth, taken to be the reference value $\phi = 0$;
or, more generally, it could be connected to some external source such
as a battery, that keeps its surface at the desired value of $\phi_a$. The
external source, such as a battery or the Earth, exchanges charges with
the conductor connected to it, in order to re-equilibrate the effect of the
external disturbances and keep the electrostatic potential on its surface
constant. In this case, the work done to reach the final configuration
must be computed at constant $\phi_a$, rather than at constant $Q_a$.

We therefore want to define another potential function $\hat{U}_E$, which
is the appropriate one when we keep fixed the electrostatic potential,
rather than the electric charge. The relevant tool here is the Legendre
transform, which is the standard technique used in similar contexts in
classical mechanics and in thermodynamics. Suppose that we have a
function of two variables U(Q, x), so that

$$dU = \left(\frac{\partial U}{\partial Q}\right)_x dQ + \left(\frac{\partial U}{\partial x}\right)_Q dx\,, \tag{5.106}$$

and we define

$$\phi(Q,x) = \left(\frac{\partial U}{\partial Q}\right)_x\,. \tag{5.107}$$

We assume that this relation can be inverted, so as to give $Q = Q(\phi,x)$.
Then, one defines the Legendre transform $\hat{U}(\phi,x)$ as

$$\hat{U}(\phi,x) = U(Q,x) - Q\phi\,, \tag{5.108}$$

where, on the right-hand side, $Q = Q(\phi,x)$. This definition is chosen so
that

$$\begin{aligned}
d\hat{U} &= dU - Q\,d\phi - \phi\,dQ \\
&= \left(\frac{\partial U}{\partial Q}\right)_x dQ + \left(\frac{\partial U}{\partial x}\right)_Q dx - Q\,d\phi - \phi\,dQ\,.
\end{aligned} \tag{5.109}$$
21 If some conductors are kept at fixed electrostatic potential and some are
isolated, so are at fixed charge, we perform the Legendre transform only for
those that are at fixed electrostatic potential, obtaining a function $\hat{U}_E$
of the variables $\phi_1,\ldots,\phi_n$, $Q_1,\ldots,Q_m$, and $\mathbf{x}_1,\ldots,\mathbf{x}_N$, where
$i = 1,\ldots,n$ labels the conductors at fixed electrostatic potential and
$j = 1,\ldots,m$ those at fixed charge (with n + m = N). For notational
simplicity, we just consider the case where all conductors are at fixed
electrostatic potential.

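Using eq. (5.107), the terms proportional to dQ cancel, and we get

$$d\hat{U} = -Q\,d\phi + \left(\frac{\partial U}{\partial x}\right)_Q dx\,. \tag{5.110}$$

We see that $\hat{U}$ is a function of $\phi$ and x, such that

$$\left(\frac{\partial\hat{U}}{\partial\phi}\right)_x = -Q\,, \tag{5.111}$$

$$\left(\frac{\partial\hat{U}}{\partial x}\right)_\phi = \left(\frac{\partial U}{\partial x}\right)_Q\,. \tag{5.112}$$

In this context, $\phi$ and Q are called “conjugate variables.” We now apply
this procedure to a system of conductors. If all of them are kept at
constant potential, we perform the Legendre transform with respect to
all variables $Q_a$, defining

$$\hat{U}_E(\phi_1,\ldots,\phi_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) - \sum_{a=1}^N Q_a\phi_a\,, \tag{5.113}$$

where, on the right-hand side, the charges $Q_a$ are expressed as functions
of the electrostatic potentials $\phi_a$.²¹ For conductors, this relation is linear
and invertible and is expressed by eq. (4.163). Note that the coefficients
of capacitance $C_{ab}$ depend on the positions $\mathbf{x}_1,\ldots,\mathbf{x}_N$ of all conductors.

Similarly to eq. (5.110), using eq. (5.103) we get

$$\begin{aligned}
d\hat{U}_E &= dU_E - \sum_{a=1}^N\phi_a\,dQ_a - \sum_{a=1}^N Q_a\,d\phi_a \\
&= \sum_{a=1}^N\left(-Q_a\,d\phi_a - \mathbf{F}_a\cdot d\mathbf{x}_a\right)\,,
\end{aligned} \tag{5.114}$$

and therefore

$$Q_a = -\left(\frac{\partial\hat{U}_E}{\partial\phi_a}\right)_{\phi',x}\,, \tag{5.115}$$

and

$$\mathbf{F}_a = -\left(\frac{\partial\hat{U}_E}{\partial\mathbf{x}_a}\right)_{\phi,x'}\,. \tag{5.116}$$

The knowledge of $\hat{U}_E$ therefore allows us to obtain the total charge on
the surface of the a-th conductor, using eq. (5.115), and the force exerted
on it, using eq. (5.116).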
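
The content of eqs. (5.108)-(5.112) can be made concrete with a one-variable example. The sketch below assumes, as an illustration that is not taken from the text, an ideal parallel-plate capacitor with $C(x) = \epsilon_0 A/x$, so that $U(Q,x) = Q^2/[2C(x)]$; the function names and numerical values are hypothetical. It checks that $\hat{U} = -U$ for this quadratic form, and that the force obtained at fixed charge from U coincides with the force obtained at fixed potential from $\hat{U}$.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in SI units (F/m)
AREA = 1e-2              # plate area in m^2 (illustrative value)

def capacitance(x):
    """Ideal parallel-plate capacitance for plate separation x (assumed model)."""
    return EPS0 * AREA / x

def u_fixed_charge(q, x):
    """U(Q, x) = Q^2 / (2 C(x)), the potential appropriate at fixed charge."""
    return q**2 / (2 * capacitance(x))

def u_fixed_potential(phi, x):
    """Legendre transform U_hat(phi, x) = U(Q, x) - Q*phi, with Q = C(x)*phi."""
    q = capacitance(x) * phi
    return u_fixed_charge(q, x) - q * phi

def force_fixed_charge(q, x, h=1e-7):
    """-dU/dx at fixed Q (central difference)."""
    return -(u_fixed_charge(q, x + h) - u_fixed_charge(q, x - h)) / (2 * h)

def force_fixed_potential(phi, x, h=1e-7):
    """-dU_hat/dx at fixed phi (central difference)."""
    return -(u_fixed_potential(phi, x + h) - u_fixed_potential(phi, x - h)) / (2 * h)

if __name__ == "__main__":
    x, phi = 1e-3, 100.0
    q = capacitance(x) * phi
    print(u_fixed_charge(q, x), u_fixed_potential(phi, x))
    print(force_fixed_charge(q, x), force_fixed_potential(phi, x))
```

Both forces come out negative, i.e., the plates attract, even though U decreases and $\hat{U}$ increases in magnitude as the plates approach; the difference is accounted for by the charge exchanged with the external source at fixed potential.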

We now recall that the potential $U_E$ can be written in the simple form
(5.95), where $\phi_a = \phi_a(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N)$ are given, in terms of the
charges $Q_1,\ldots,Q_N$ and of the positions $\mathbf{x}_1,\ldots,\mathbf{x}_N$, by the inversion of
eq. (4.163),

$$\phi_a = \sum_{b=1}^N P_{ab}\,Q_b\,, \tag{5.117}$$

as in eq. (4.164). Note, again, that the dependence on $\mathbf{x}_1,\ldots,\mathbf{x}_N$ enters
through the coefficients $C_{ab}$, or through the coefficients $P_{ab}$ of the inverse
matrix. Then, from eq. (5.113), performing the Legendre transform on
$U_E$ has the effect of just flipping the sign,

$$\hat{U}_E(\phi_1,\ldots,\phi_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = -\frac{1}{2}\sum_{a=1}^N Q_a\phi_a\,, \tag{5.118}$$

where now the $\phi_a$ are taken as the independent variables, and the $Q_a$
are expressed in terms of them, and of the $\mathbf{x}_a$, using eq. (4.163). In other
words,

$$\hat{U}_E(\phi_1,\ldots,\phi_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = -U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N)\,, \tag{5.119}$$

where, on the right-hand side, the charges $Q_1,\ldots,Q_N$ must be expressed
as functions of $\phi_1,\ldots,\phi_N$ and $\mathbf{x}_1,\ldots,\mathbf{x}_N$; or, equivalently,

$$U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = -\hat{U}_E(\phi_1,\ldots,\phi_N;\mathbf{x}_1,\ldots,\mathbf{x}_N)\,, \tag{5.120}$$

where, on the right-hand side, the electrostatic potentials $\phi_1,\ldots,\phi_N$ must
be expressed as functions of $Q_1,\ldots,Q_N$ and $\mathbf{x}_1,\ldots,\mathbf{x}_N$. Combining this
with eq. (5.90), we see that

$$\hat{U}_E = -E_E\,. \tag{5.121}$$


In conclusion, if the charges on the conductors are conserved, the
appropriate potential function to use is $U_E$. Using eq. (5.96), together with
eq. (5.117), we can write

$$U_E(Q_1,\ldots,Q_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{2}\sum_{a,b=1}^N P_{ab}(\mathbf{x}_1,\ldots,\mathbf{x}_N)\,Q_a Q_b\,. \tag{5.122}$$

The force acting on the a-th conductor can then be computed using
eq. (5.105), which gives

$$\mathbf{F}_a = -\frac{1}{2}\sum_{b,c=1}^N\frac{\partial P_{bc}}{\partial\mathbf{x}_a}\,Q_b Q_c\,. \tag{5.123}$$

If, instead, we fix the values of the electrostatic potentials $\phi_a$ on the
conductors, the appropriate potential function is $\hat{U}_E$, given in eq. (5.118).
Using eq. (4.163), we can write it as

$$\hat{U}_E(\phi_1,\ldots,\phi_N;\mathbf{x}_1,\ldots,\mathbf{x}_N) = -\frac{1}{2}\sum_{a,b=1}^N C_{ab}(\mathbf{x}_1,\ldots,\mathbf{x}_N)\,\phi_a\phi_b\,. \tag{5.124}$$

Equation (5.116) then gives the force on the a-th conductor in terms
of the electrostatic potentials $\phi_1,\ldots,\phi_N$ assigned on the surfaces of the
conductors,

$$\mathbf{F}_a = \frac{1}{2}\sum_{b,c=1}^N\frac{\partial C_{bc}}{\partial\mathbf{x}_a}\,\phi_b\phi_c\,. \tag{5.125}$$
124 Electromagnetic energy 


22 This follows from the fact that $C^{-1}C$ is equal to the identity matrix I,
and therefore

$$0 = \partial_a(C^{-1}C) = (\partial_a C^{-1})\,C + C^{-1}(\partial_a C)\,.$$

Multiplying by $C^{-1}$ from the right, we get
$\partial_a C^{-1} = -C^{-1}(\partial_a C)\,C^{-1}$.


Note, however, that the force acting on a conductor is an instantaneous 
property of the system, independent of whether the system was assem- 
bled keeping the charges fixed or the electrostatic potentials, so, in the 
end, eqs. (5.123) and (5.125) must give the same numerical result, in 
one case expressed in terms of the charges, and in the other in terms of 
the electrostatic potentials generated by these charges in the given
configuration of conductors. This can be explicitly shown recalling, from
eqs. (4.163) and (4.164) that, denoting by $C$ the matrix with matrix
elements $C_{ab}$, and by $P$ the matrix with matrix elements $P_{ab}$, we have
$P = C^{-1}$. We denote by $Q$ the (column) vector with components $Q_a$
and by $\phi$ the (column) vector with components $\phi_a$. Then, in matrix
form, eq. (4.163) reads

\[
Q = C\phi\,, \tag{5.126}
\]

while eq. (5.123) reads

\[
\mathbf{F}_a = -\frac{1}{2}\, Q^T (\partial_a P)\, Q\,, \tag{5.127}
\]

where $\partial_a = \partial/\partial\mathbf{x}_a$ and $Q^T$ is the transpose vector, i.e., the vector
written as a row rather than as a column. From eq. (5.126), we also have
$Q^T = \phi^T C^T$, where $C^T$ is the transpose matrix. However, $C$ is a symmetric
matrix, $C_{ab} = C_{ba}$, so $C = C^T$ (we will prove this in Problem 5.2) and
therefore $Q^T = \phi^T C$. We also use $\partial_a C^{-1} = -C^{-1}(\partial_a C)C^{-1}$.²² Then,

\[
\mathbf{F}_a = -\frac{1}{2}\, Q^T (\partial_a C^{-1})\, Q
= \frac{1}{2}\, Q^T C^{-1} (\partial_a C)\, C^{-1} Q
= \frac{1}{2}\, \phi^T (\partial_a C)\, \phi\,, \tag{5.128}
\]

which is just eq. (5.125), written in matrix form.
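The equivalence of eqs. (5.123) and (5.125) can also be checked numerically. The following is only a sketch under stated assumptions: the one-parameter capacitance matrix `capacitance(x)` is a hypothetical toy model (it is not derived from any actual geometry), the conductors are displaced along a single coordinate `x`, and the derivatives are taken by finite differences.

```python
# Toy check that eqs. (5.123) and (5.125) give the same force.
# The capacitance model below is hypothetical, chosen only to be
# symmetric, invertible and dependent on a single separation x.

def capacitance(x, c1=2.0, c2=3.0, k=1.5):
    """Hypothetical 2x2 capacitance matrix C_ab for separation x."""
    m = k / x
    return [[c1 + m, -m], [-m, c2 + m]]

def inverse(mat):
    """Inverse of a 2x2 matrix; here P = C^{-1}."""
    (a, b), (c, d) = mat
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(mat, vec):
    """Matrix-vector product for 2x2 matrices."""
    return [mat[i][0] * vec[0] + mat[i][1] * vec[1] for i in range(2)]

def bilinear(left, mat, right):
    """Compute left^T mat right."""
    image = apply(mat, right)
    return left[0] * image[0] + left[1] * image[1]

def d_dx(matrix_fn, x, h=1e-5):
    """Central finite-difference derivative of a matrix-valued function."""
    up, down = matrix_fn(x + h), matrix_fn(x - h)
    return [[(up[i][j] - down[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def force_from_charges(charges, x):
    """Eq. (5.123): F = -(1/2) Q^T (dP/dx) Q, at fixed charges."""
    d_p = d_dx(lambda y: inverse(capacitance(y)), x)
    return -0.5 * bilinear(charges, d_p, charges)

def force_from_potentials(potentials, x):
    """Eq. (5.125): F = +(1/2) phi^T (dC/dx) phi, at fixed potentials."""
    d_c = d_dx(capacitance, x)
    return 0.5 * bilinear(potentials, d_c, potentials)

x0 = 0.8
charges = [1.0, -0.4]
potentials = apply(inverse(capacitance(x0)), charges)  # phi = P Q
f_q = force_from_charges(charges, x0)
f_phi = force_from_potentials(potentials, x0)
print(f"force from charges: {f_q:.8f}, from potentials: {f_phi:.8f}")
```

The two numbers agree up to the finite-difference error, as expected from the matrix identity used in eq. (5.128).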


5.5.2 Mechanical potentials in magnetostatics 


The force between two static current distributions was computed in Sec- 
tion 4.2.4, see eq. (4.118). We now want to find a mechanical potential 
function from which this magnetic force can be obtained. The procedure 
is analogous to that presented in Section 5.5.1 for the electrostatic case. 
Similarly to the discussion there, we consider two current distributions
$\mathbf{j}_1(\mathbf{x})$ and $\mathbf{j}_2(\mathbf{x})$, localized in the non-overlapping volumes $V_1$ and $V_2$,
respectively (and, therefore, separately conserved). We use a coordinate
$\mathbf{x}_1$ to label rigid displacements of $V_1$, and $\mathbf{x}_2$ for $V_2$. For instance, for
two circular loops of currents, $\mathbf{x}_1$ and $\mathbf{x}_2$ could be the coordinates of
the centers of the two loops. We then denote by $\mathbf{j}_1^{(0)}(\mathbf{x})$ the first current
density, in the reference frame where $\mathbf{x}_1 = 0$, and similarly by $\mathbf{j}_2^{(0)}(\mathbf{x})$
the second current density, in the reference frame where $\mathbf{x}_2 = 0$. Then,
in a frame where the centers of the two loops have generic coordinates
$\mathbf{x}_1$ and $\mathbf{x}_2$, respectively, we have

\[
\mathbf{j}_1(\mathbf{x}) = \mathbf{j}_1^{(0)}(\mathbf{x} - \mathbf{x}_1)\,, \tag{5.129}
\]

and

\[
\mathbf{j}_2(\mathbf{x}) = \mathbf{j}_2^{(0)}(\mathbf{x} - \mathbf{x}_2)\,. \tag{5.130}
\]


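In this notation, the forces (4.118) and (4.120) read

\[
\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\mu_0}{4\pi}\int d^3x\, d^3x'\; \mathbf{j}_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\cdot\mathbf{j}_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\, \frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}\,, \tag{5.131}
\]

and [after renaming $\mathbf{x} \leftrightarrow \mathbf{x}'$ in eq. (4.120)]

\[
\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\mu_0}{4\pi}\int d^3x\, d^3x'\; \mathbf{j}_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\cdot\mathbf{j}_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)\, \frac{\mathbf{x}'-\mathbf{x}}{|\mathbf{x}-\mathbf{x}'|^3}\,, \tag{5.132}
\]

i.e., $\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = -\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2)$. Proceeding exactly as in Note 19 on
page 116, we see that the desired potential function is

\[
\hat{U}_B^{\rm int}(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\mu_0}{4\pi}\int d^3x\, d^3x'\; \frac{\mathbf{j}_1^{(0)}(\mathbf{x}-\mathbf{x}_1)\cdot\mathbf{j}_2^{(0)}(\mathbf{x}'-\mathbf{x}_2)}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.133}
\]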


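since the magnetic forces (5.131) and (5.132) can be obtained from

\[
\mathbf{F}_1(\mathbf{x}_1,\mathbf{x}_2) = -\left(\frac{\partial \hat{U}_B^{\rm int}}{\partial \mathbf{x}_1}\right)_{\mathbf{x}_2,\,\mathbf{j}}\,, \tag{5.134}
\]

and $\mathbf{F}_2(\mathbf{x}_1,\mathbf{x}_2) = -\left(\partial \hat{U}_B^{\rm int}/\partial \mathbf{x}_2\right)_{\mathbf{x}_1,\,\mathbf{j}}$. Rewriting eq. (5.133) in terms of
$\mathbf{j}_1(\mathbf{x})$ and $\mathbf{j}_2(\mathbf{x})$, we have

\[
\hat{U}_B^{\rm int}(\mathbf{x}_1,\mathbf{x}_2) = -\frac{\mu_0}{4\pi}\int d^3x\, d^3x'\; \frac{\mathbf{j}_1(\mathbf{x})\cdot\mathbf{j}_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.135}
\]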
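where the dependence on $\mathbf{x}_1$ and $\mathbf{x}_2$ is now implicit in $\mathbf{j}_1(\mathbf{x})$ and $\mathbf{j}_2(\mathbf{x}')$,
respectively, just as in the discussion in Section 5.5 for the electrostatic
case. The reason why for this function we used the notation $\hat{U}_B^{\rm int}$, rather
than just $U_B^{\rm int}$, will become clear soon.

Exactly as we did in eqs. (5.83)–(5.89) for the electrostatic case, it is
convenient to introduce a function

\[
\hat{U}_B[\mathbf{j}] = -\frac{\mu_0}{8\pi}\int d^3x\, d^3x'\; \frac{\mathbf{j}(\mathbf{x})\cdot\mathbf{j}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.136}
\]

If $\mathbf{j}(\mathbf{x}) = \mathbf{j}_1(\mathbf{x}) + \mathbf{j}_2(\mathbf{x})$, where $\mathbf{j}_1(\mathbf{x})$ and $\mathbf{j}_2(\mathbf{x})$ are localized in two
non-overlapping volumes $V_1$ and $V_2$ identified by coordinates $\mathbf{x}_1$ and $\mathbf{x}_2$,
respectively, eq. (5.136) becomes

\[
\hat{U}_B(\mathbf{x}_1,\mathbf{x}_2) = \hat{U}_B^{(1)} + \hat{U}_B^{(2)} + \hat{U}_B^{\rm int}(\mathbf{x}_1,\mathbf{x}_2)\,, \tag{5.137}
\]

where

\[
\hat{U}_B^{(a)} = -\frac{\mu_0}{8\pi}\int d^3x\, d^3x'\; \frac{\mathbf{j}_a(\mathbf{x})\cdot\mathbf{j}_a(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \tag{5.138}
\]

(with $a = 1, 2$), while the interaction term $\hat{U}_B^{\rm int}$ is given by eq. (5.135).
Just as in the discussion following eq. (5.87), the self-energy terms $\hat{U}_B^{(1)}$
and $\hat{U}_B^{(2)}$ do not depend on $\mathbf{x}_1$ and $\mathbf{x}_2$, so we can use $\hat{U}_B(\mathbf{x}_1,\mathbf{x}_2)$ as the
mechanical potential, in place of $\hat{U}_B^{\rm int}(\mathbf{x}_1,\mathbf{x}_2)$. As in the electrostatic
case, the form (5.136) of the mechanical potential is also valid for $N$
charged objects with current densities $\mathbf{j}_a(\mathbf{x})$ ($a = 1,\dots,N$), localized in
non-overlapping volumes $V_a$ identified by the position $\mathbf{x}_a$, simply setting
$\mathbf{j}(\mathbf{x}) = \sum_{a=1}^{N} \mathbf{j}_a(\mathbf{x})$.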
From eqs. (5.53) and (5.136), we get

\[
\hat{U}_B = -\mathcal{E}_B\,. \tag{5.139}
\]

Comparing this to the electrostatic case, where eqs. (5.90) and (5.121)
hold, we see that $\hat{U}_B$ is really the magnetostatic analog of $\hat{U}_E$, rather
than of $U_E$, which is the reason why we used the notation $\hat{U}_B$ when we
introduced it in eqs. (5.133) and (5.136).

The reason is that, while the charge on an isolated conductor moving 
in an electric field is conserved, the current of an isolated loop moving 
in a magnetic field is not. If we want to build a configuration of charges 
at given positions, all the work that we do goes into the mechanical po- 
tential energy, and no work is needed to keep the charges at the initial 
value. Therefore, in this case, the energy of the system, defined as the
work needed to assemble it, is the same as the mechanical potential energy.
As we saw in Section 5.5.1, this is not the case when we assemble a
configuration of conductors with given values of the electrostatic potentials
$\phi_a$ at their surfaces, since the latter are not conserved quantities, and
we must also provide some extra “electric” work, e.g., through batteries
that keep the electrostatic potentials constant. The magnetostatic
case at fixed currents that we are considering here is analogous to the
electrostatic case at fixed $\phi_a$, rather than at fixed charges. To assemble
a configuration of loops at given positions and with given currents, it is 
not sufficient to take a set of loops with the desired currents, at infinite 
distance from each other, and compute the mechanical work necessary 
to bring them to the desired final positions, as we did in Section 5.1 for 
a set of charges. We must also provide, along the way, further electric 
work to maintain the currents fixed, connecting the loops to batteries. 
Otherwise, as we move a loop in the magnetic field created by the others, 
an electric field is induced in the loop, as we saw in Section 4.3.2, and 
the corresponding “motional” electromotive force changes the value of 
the current in the loop. The net effect is that the final energy $\mathcal{E}_B$ of the
system has the form $\mathcal{E}_B = \hat{U}_B + W_e$, where $W_e$ is the required electric
work. Since we have found that $\hat{U}_B = -\mathcal{E}_B$, we conclude that the
required electric work is $W_e = 2\mathcal{E}_B$. The same holds for the electrostatic
case at fixed $\phi_a$.

We now consider a set of current loops, and we ask what the magnetic
analog of the potential $U_E$ is. For a set of loops, with currents $I_a$ and
identified by the position $\mathbf{x}_a$ (given, e.g., by the position of their centers,
for circular loops), using eqs. (5.61) and (5.139), we have

\[
\hat{U}_B(I_1,\dots,I_N;\mathbf{x}_1,\dots,\mathbf{x}_N) = -\frac{1}{2}\sum_{a=1}^{N} I_a \Phi_{B,a}\,. \tag{5.140}
\]

Here, the magnetic fluxes $\Phi_{B,a}$ are expressed in terms of the currents
through eq. (5.63).²³ Consider now the function of the fluxes defined by

\[
U_B(\Phi_{B,1},\dots,\Phi_{B,N};\mathbf{x}_1,\dots,\mathbf{x}_N) = \frac{1}{2}\sum_{a=1}^{N} I_a \Phi_{B,a}\,, \tag{5.142}
\]

where now the currents are expressed in terms of the fluxes by inverting
eq. (5.63). It is clear that $\hat{U}_B$ is just the Legendre transform of $U_B$, with
the currents $I_a$ and the magnetic fluxes $\Phi_{B,a}$ playing the role of conjugate
variables since, from the explicit expressions given in eqs. (5.140) and
(5.142), we have

\[
\hat{U}_B(I_1,\dots,I_N;\mathbf{x}_1,\dots,\mathbf{x}_N) = U_B(\Phi_{B,1},\dots,\Phi_{B,N};\mathbf{x}_1,\dots,\mathbf{x}_N) - \sum_{a=1}^{N} I_a \Phi_{B,a}\,. \tag{5.143}
\]

Then, eq. (5.139) implies

\[
U_B = \mathcal{E}_B\,. \tag{5.144}
\]


23 Similarly to the electrostatic case, the dependence on the positions
$\mathbf{x}_1,\dots,\mathbf{x}_N$ enters through the fact that, for given currents, the fluxes
depend on the currents and on the positions of the loops; in particular, the
dependence on the positions is carried by the mutual inductances $L_{ab}$, so
eq. (5.63), more precisely, reads

\[
\Phi_{B,a}(I_1,\dots,I_N;\mathbf{x}_1,\dots,\mathbf{x}_N) = \sum_{b=1}^{N} L_{ab}(\mathbf{x}_1,\dots,\mathbf{x}_N)\, I_b\,. \tag{5.141}
\]

24 This is analogous to the assembling of a static charge configuration by
carrying elementary charged particles from infinity to the desired final
position, studied in Section 5.1. We are therefore working at fixed charge, in
the sense that no extra work is needed to ensure that the charges of the
elementary particles stay fixed.

We see that the situation is exactly the same as in the electrostatic case,
with the replacements $U_E \leftrightarrow U_B$, $\hat{U}_E \leftrightarrow \hat{U}_B$, $\phi_a \leftrightarrow I_a$ and $Q_a \leftrightarrow \Phi_{B,a}$.
Then, proceeding exactly as we did in Section 5.5.1 for the electrostatic
case, we now find that

\[
\Phi_{B,a} = -\left(\frac{\partial \hat{U}_B}{\partial I_a}\right)\,, \tag{5.145}
\]

and

\[
\mathbf{F}_a = -\left(\frac{\partial \hat{U}_B}{\partial \mathbf{x}_a}\right)_{I}\,, \tag{5.146}
\]

which are the analogues of eqs. (5.115) and (5.116). Therefore, the
mechanical potential $\hat{U}_B$ can be used to obtain the force on the $a$-th
loop for a system of loops at fixed currents, taking minus its gradient,
while (minus) its derivative with respect to the current $I_a$, with all other
currents fixed, gives the magnetic flux through the $a$-th loop. Using
eq. (5.63), we can express $\hat{U}_B$ directly in terms of the currents, as

\[
\hat{U}_B(I_1,\dots,I_N;\mathbf{x}_1,\dots,\mathbf{x}_N) = -\frac{1}{2}\sum_{a,b=1}^{N} L_{ab}(\mathbf{x}_1,\dots,\mathbf{x}_N)\, I_a I_b\,, \tag{5.147}
\]

and therefore the force on the $a$-th loop, in a set of loops at fixed currents,
is

\[
\mathbf{F}_a = \frac{1}{2}\sum_{b,c=1}^{N} \left(\frac{\partial L_{bc}}{\partial \mathbf{x}_a}\right) I_b I_c\,, \tag{5.148}
\]

to be compared with eq. (5.125) for the case of conductors at fixed
electrostatic potentials. The potential $U_B$ satisfies instead

\[
I_a = \left(\frac{\partial U_B}{\partial \Phi_{B,a}}\right)\,, \tag{5.149}
\]

\[
\mathbf{F}_a = -\left(\frac{\partial U_B}{\partial \mathbf{x}_a}\right)_{\Phi_B}\,, \tag{5.150}
\]

which are the analogues of the corresponding relations satisfied by $U_E$ in
the electrostatic case. Denoting by $L$ the matrix with elements $L_{ab}$ and by
$L^{-1}$ the inverse matrix, with matrix elements $L^{-1}_{ab}$, we have

\[
I_a = \sum_{b=1}^{N} L^{-1}_{ab}\, \Phi_{B,b}\,, \tag{5.151}
\]

\[
U_B(\Phi_{B,1},\dots,\Phi_{B,N};\mathbf{x}_1,\dots,\mathbf{x}_N) = \frac{1}{2}\sum_{a,b=1}^{N} L^{-1}_{ab}(\mathbf{x}_1,\dots,\mathbf{x}_N)\, \Phi_{B,a}\Phi_{B,b}\,, \tag{5.152}
\]

and

\[
\mathbf{F}_a = -\frac{1}{2}\sum_{b,c=1}^{N} \left(\frac{\partial L^{-1}_{bc}}{\partial \mathbf{x}_a}\right) \Phi_{B,b}\Phi_{B,c}\,. \tag{5.153}
\]

Exactly as we did in eqs. (5.126)—(5.128), we can then prove that the 
force computed using eq. (5.148) is the same as that computed using 
eq. (5.153), in one case expressed in terms of the currents, and in the 
other in terms of the magnetic fluxes generated by these currents in the 
given configuration of loops.


5.6 Solved problems


Problem 5.1. Energy stored in a capacitor


In this chapter we have shown, in full generality, that the energy stored in 
the electric field of a system with given static charges is the same as the work 
that has been done (at fixed charge) to assemble the corresponding charge 
configuration. 

It is instructive to check this explicitly for a single capacitor made of two
plates, computing on the one hand the work done to charge it, and on the
other hand the electromagnetic energy stored in it. We consider an initial
configuration in which the two plates are both uncharged, and we transfer
elementary charges from one plate to the other so that, at the end, the two
plates have charges $Q$ and $-Q$, respectively (we take $Q > 0$).²⁴ Let $q$ and $-q$
be the values of the charges on the two plates at some stage of this charging
process. From eq. (4.157), for a capacitor of capacitance $C$, when the charge
on the positive plate is $q$, the potential difference between the plates has
a value $V = q/C$. We now imagine transporting an infinitesimal positive
charge $dq$ from the negatively charged to the positively charged plate (more
realistically, we would rather move the negatively charged electrons in the
opposite direction, but the mathematics is the same), so that the charges on
the two plates become $-(q + dq)$ and $q + dq$, respectively. The work done
by an external agent to carry this charge across a potential difference $V$ is
$dW_{\rm ext} = V dq$, so, using $V = q/C$,

\[
W_{\rm ext} = \int_0^Q dq\, V = \int_0^Q dq\, \frac{q}{C} = \frac{Q^2}{2C}\,. \tag{5.154}
\]

Using $Q = CV$, this can be rewritten in the equivalent forms $W_{\rm ext} = (1/2)CV^2$
or $W_{\rm ext} = (1/2)QV$. We can compare this result with that obtained from
the general expression (3.41) for the energy of the electromagnetic field. For
a parallel-plate capacitor, $\mathbf{E}$ is given by eq. (4.153) (while $\mathbf{B} = 0$). Then
eq. (3.41) gives


\[
\mathcal{E}_{\rm em} = \frac{\epsilon_0}{2}\int d^3x\, E^2
= \frac{\epsilon_0}{2}\left(\frac{\sigma}{\epsilon_0}\right)^2 A d
= \frac{Q^2}{2C}\,, \tag{5.155}
\]

where the field between the plates, of area $A$ and separation $d$, is $E = \sigma/\epsilon_0$,
and we used $Q = \sigma A$ and $C = \epsilon_0 A/d$, from eq. (4.158). Comparing this
with eq. (5.154), we therefore see that the work done to assemble the static
charge configuration of a parallel-plate capacitor is equal to the energy stored
in its final electric field, in agreement with the general result. Note that we
could have obtained eq. (5.155) without even using the explicit expressions for
the electric field inside the parallel-plate capacitor and for its capacitance $C$,
but just using eq. (5.44) or eq. (5.45), which give, in full generality, the energy
stored in the electric field of a system of capacitors. For the single capacitor
that we are considering, the matrix $C_{ab}$ in eq. (5.44) becomes a single number
$C$ and the inverse matrix $P_{ab}$ becomes the single number $1/C$, so eq. (5.45)
reduces to eq. (5.155).

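We can make the same check for a spherical capacitor. Inserting eq. (4.165)
into eq. (3.41),

\[
\begin{aligned}
\mathcal{E}_{\rm em} &= \frac{\epsilon_0}{2}\int d^3x\, E^2 \\
&= \frac{\epsilon_0}{2}\, 4\pi \int_b^a dr\, r^2\, \frac{Q^2}{(4\pi\epsilon_0)^2}\, \frac{1}{r^4} \\
&= \frac{1}{2}\, \frac{Q^2}{4\pi\epsilon_0} \left(\frac{1}{b} - \frac{1}{a}\right) \\
&= \frac{1}{2}\, Q V\,,
\end{aligned} \tag{5.156}
\]

where, in the last line, we used eq. (4.166). Using $V = Q/C$, we get again
$\mathcal{E}_{\rm em} = Q^2/(2C)$, in agreement with eq. (5.154) (which was obtained
independently of the plate’s geometry).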
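The same comparison can be made numerically. The following sketch assumes a spherical capacitor with inner radius `r_in` and outer radius `r_out`, the Coulomb field between the shells, and the capacitance that follows from the potential difference used in eq. (5.156); the numerical values and the helper `midpoint_integral` are illustrative choices, not taken from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, in F/m

def midpoint_integral(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def field_energy(charge, r_in, r_out):
    """(eps0 / 2) times the integral of E^2 over the gap between the shells."""
    def shell(r):
        e_field = charge / (4 * math.pi * EPS0 * r * r)
        return 0.5 * EPS0 * e_field ** 2 * 4 * math.pi * r * r
    return midpoint_integral(shell, r_in, r_out)

def charging_work(charge, capacitance):
    """Integral of V dq = (q / C) dq from 0 to the final charge, eq. (5.154)."""
    return midpoint_integral(lambda q: q / capacitance, 0.0, charge)

r_in, r_out, charge = 0.05, 0.10, 1e-9  # metres, metres, coulombs
capacitance = 4 * math.pi * EPS0 / (1 / r_in - 1 / r_out)

e_field_energy = field_energy(charge, r_in, r_out)
w_ext = charging_work(charge, capacitance)
closed_form = charge ** 2 / (2 * capacitance)
print(e_field_energy, w_ext, closed_form)
```

The three numbers agree up to the discretization error of the radial integral.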






Problem 5.2. Green’s reciprocity relation and properties of mutual 
capacitances 


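Consider a charge distribution $\rho_1(\mathbf{x})$. The potential that it generates is

\[
\phi_1(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\, \frac{\rho_1(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.157}
\]

Similarly, another charge distribution $\rho_2(\mathbf{x})$ generates a potential

\[
\phi_2(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\, \frac{\rho_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.158}
\]

Then,

\[
\int d^3x\, \rho_1(\mathbf{x})\phi_2(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x\, d^3x'\, \frac{\rho_1(\mathbf{x})\rho_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{5.159}
\]

On the other hand,

\[
\int d^3x\, \rho_2(\mathbf{x})\phi_1(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x\, d^3x'\, \frac{\rho_2(\mathbf{x})\rho_1(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{5.160}
\]

which, upon renaming $\mathbf{x} \leftrightarrow \mathbf{x}'$, is the same as the right-hand side of eq. (5.159).
Therefore, we have the identity

\[
\int d^3x\, \rho_1(\mathbf{x})\phi_2(\mathbf{x}) = \int d^3x\, \rho_2(\mathbf{x})\phi_1(\mathbf{x})\,, \tag{5.161}
\]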


which is valid for arbitrary charge densities $\rho_1(\mathbf{x})$ and $\rho_2(\mathbf{x})$, and the potentials
$\phi_1(\mathbf{x})$ and $\phi_2(\mathbf{x})$ that they generate. Equation (5.161) is known as Green’s
reciprocity relation. A discrete version of this identity is obtained considering,
as first charge distribution, a set of conductors with charges $Q_1,\dots,Q_N$. We
denote by $\phi_1,\dots,\phi_N$ the corresponding values of the electrostatic potentials
on their surface. As the second charge distribution, we take the same set
of conductors, in the same positions, but with charges $Q'_1,\dots,Q'_N$, and we
denote by $\phi'_1,\dots,\phi'_N$ the corresponding values of the electrostatic potential
on their surface. Then, $\rho_1(\mathbf{x}) = \sum_a \rho_{1,a}(\mathbf{x})$, where $\rho_{1,a}$ is non-vanishing only
on the surface of the $a$-th conductor, where $\phi_2(\mathbf{x})$ has the constant value $\phi'_a$, and
therefore,

\[
\int d^3x\, \rho_1(\mathbf{x})\phi_2(\mathbf{x}) = \sum_{a=1}^{N} \phi'_a \int d^3x\, \rho_{1,a}(\mathbf{x}) = \sum_{a=1}^{N} Q_a \phi'_a\,. \tag{5.162}
\]

Similarly,

\[
\int d^3x\, \rho_2(\mathbf{x})\phi_1(\mathbf{x}) = \sum_{a=1}^{N} \phi_a \int d^3x\, \rho_{2,a}(\mathbf{x}) = \sum_{a=1}^{N} Q'_a \phi_a\,. \tag{5.163}
\]


Therefore, for a set of conductors, Green’s reciprocity relation becomes

\[
\sum_{a=1}^{N} Q_a \phi'_a = \sum_{a=1}^{N} Q'_a \phi_a\,. \tag{5.164}
\]

From eq. (4.163), we have $Q_a = \sum_b C_{ab}\phi_b$ and $Q'_a = \sum_b C_{ab}\phi'_b$, with the
same coefficients $C_{ab}$, since these depend only on the geometry of the system,
i.e., on the positions and shapes of the conductors, that we have taken to be
the same in the two cases. Then eq. (5.164) gives

\[
\sum_{a,b=1}^{N} C_{ab}\, \phi_b \phi'_a = \sum_{a,b=1}^{N} C_{ab}\, \phi'_b \phi_a\,, \tag{5.165}
\]

or, renaming $a \leftrightarrow b$ in the second sum,

\[
\sum_{a,b=1}^{N} C_{ab}\, \phi'_a \phi_b = \sum_{a,b=1}^{N} C_{ba}\, \phi'_a \phi_b\,. \tag{5.166}
\]

Since this is valid for arbitrary choices of the charge configurations $Q$ and $Q'$,
and therefore for arbitrary values of $\phi$ and $\phi'$, we conclude that

\[
C_{ab} = C_{ba}\,, \tag{5.167}
\]


i.e., the capacitance matrix is symmetric. We can repeat exactly the same
argument in magnetostatics. In this case we get

\[
\int d^3x\, \mathbf{j}_1(\mathbf{x})\cdot\mathbf{A}_2(\mathbf{x}) = \int d^3x\, \mathbf{j}_2(\mathbf{x})\cdot\mathbf{A}_1(\mathbf{x})\,. \tag{5.168}
\]

We can then apply it to a given configuration of loops. Let $I_1,\dots,I_N$ be a set
of values of the currents in the loops, and $\Phi_{B,1},\dots,\Phi_{B,N}$ the corresponding
magnetic fluxes; and let $I'_1,\dots,I'_N$ be another set of values of the currents of
the same loops, and $\Phi'_{B,1},\dots,\Phi'_{B,N}$ the corresponding magnetic fluxes. Then,

\[
\sum_{a=1}^{N} I_a \Phi'_{B,a} = \sum_{a=1}^{N} I'_a \Phi_{B,a}\,. \tag{5.169}
\]

From this, we can prove in the same way as before that $L_{ab} = L_{ba}$. In this
case, we already knew this from the explicit expression (5.66), while for $C_{ab}$
there is not an equally simple general expression.
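Green's reciprocity relation (5.161) is easy to test on a discretized example. In the sketch below each charge distribution is replaced by a few point charges, so that the integral of $\rho_1\phi_2$ becomes a sum of $q\,\phi$ over the charges of the first set; the positions and charges are arbitrary illustrative values, and we work in units where $1/(4\pi\epsilon_0) = 1$.

```python
import math

def potential(sources, point):
    """Potential at `point` generated by point charges [(q, position), ...]."""
    return sum(q / math.dist(pos, point) for q, pos in sources)

def overlap(charges_a, charges_b):
    """Discrete analog of the integral of rho_a(x) * phi_b(x)."""
    return sum(q * potential(charges_b, pos) for q, pos in charges_a)

# Two arbitrary sets of point charges (charge, position); no position is shared.
set_1 = [(1.0, (0.0, 0.0, 0.0)), (-2.0, (1.0, 0.5, 0.0)), (0.5, (0.0, 2.0, 1.0))]
set_2 = [(0.7, (3.0, 0.0, 1.0)), (-1.1, (-1.0, -1.0, 2.0))]

lhs = overlap(set_1, set_2)  # sum over set 1 of q * phi_2
rhs = overlap(set_2, set_1)  # sum over set 2 of q' * phi_1
print(lhs, rhs)
```

Both sums reduce to the same double sum over pairs of charges divided by their mutual distance, which is the discrete counterpart of the symmetry of the right-hand sides of eqs. (5.159) and (5.160).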


Problem 5.3. Energy stored in a wire loop 


We now consider the work that should be done to create a current $I$ in a
loop. This is a special case of the discussion in Section 5.3 and allows us to
illustrate that general analysis in a simpler setting. As in the discussion of
Section 5.3, we raise the current in the loop from zero to the final value $I$. Even
if the final value is constant in time, during the transient period necessary to
reach the final value, $I$ is a function of time, $I = I(t)$. As a consequence,
according to eq. (4.200), the magnetic flux $\Phi_B$ through the loop also changes
in time,

\[
\frac{d\Phi_B}{dt} = L\, \frac{dI}{dt}\,, \tag{5.170}
\]

where $L$ is the self-inductance of the loop. As already discussed after eq. (4.200),
this creates a “back-emf”

\[
\mathcal{E}_{\rm back} = -L\, \frac{dI}{dt}\,, \tag{5.171}
\]

that opposes the increase in the current. An external agent must therefore
provide work against this back-emf. From the definition of emf as the line
integral of $\mathbf{E}$ around the circuit, the work needed to push a charge $dq$ through
a single trip around the loop is

\[
dW_{\rm ext} = -\mathcal{E}_{\rm back}\, dq = L\, \frac{dI}{dt}\, dq\,. \tag{5.172}
\]


On the other hand, if we have a current $I$ in the loop, the charge that moves
around the circuit in a time $dt$ is $dq = I\,dt$. Therefore,

\[
\frac{dW_{\rm ext}}{dt} = L I\, \frac{dI}{dt}\,. \tag{5.173}
\]

With the initial condition $W_{\rm ext} = 0$ when $I = 0$, this integrates to

\[
W_{\rm ext} = \frac{1}{2} L I^2\,, \tag{5.174}
\]

or, using eq. (4.200),

\[
W_{\rm ext} = \frac{1}{2} I \Phi_B\,. \tag{5.175}
\]

This is the work done to create the desired final steady current $I$ in the circuit.
As we see comparing with eq. (5.68), this is indeed the same as the energy
$\mathcal{E}_B$ stored in the magnetic field generated by the current $I$, that we obtained
starting from the general definition

\[
\mathcal{E}_B = \frac{1}{2\mu_0}\int d^3x\, |\mathbf{B}|^2\,. \tag{5.176}
\]
It is important to appreciate that this energy, which is stored in the magnetic 
field and is associated with the self-induction L, is fully recoverable, for in- 
stance, turning the current back to zero. This should be contrasted with the 
energy dissipated in the wire because of its resistance R, which is irretrievably 
lost to Joule heating. 
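That the work (5.174) depends only on the final current, and not on how the current is raised, can be illustrated by integrating eq. (5.173) numerically for two different transients. This is a sketch under stated assumptions: the values of the self-inductance, final current and ramp duration are hypothetical, as are the two ramp profiles `linear` and `smooth`.

```python
import math

def work_against_back_emf(current, current_rate, inductance, duration, n=20000):
    """Integrate dW/dt = L * I * dI/dt, eq. (5.173), with the midpoint rule."""
    dt = duration / n
    times = ((k + 0.5) * dt for k in range(n))
    return dt * sum(inductance * current(t) * current_rate(t) for t in times)

L_SELF = 2.0e-3   # hypothetical self-inductance, in henry
I_FINAL = 3.0     # final current, in ampere
T_RAMP = 0.5      # duration of the transient, in seconds

def linear(t):
    return I_FINAL * t / T_RAMP

def linear_rate(t):
    return I_FINAL / T_RAMP

def smooth(t):
    return I_FINAL * math.sin(0.5 * math.pi * t / T_RAMP) ** 2

def smooth_rate(t):
    # d/dt of sin^2(u) is sin(2u) * du/dt, with u = pi * t / (2 * T_RAMP)
    return I_FINAL * (0.5 * math.pi / T_RAMP) * math.sin(math.pi * t / T_RAMP)

w_linear = work_against_back_emf(linear, linear_rate, L_SELF, T_RAMP)
w_smooth = work_against_back_emf(smooth, smooth_rate, L_SELF, T_RAMP)
expected = 0.5 * L_SELF * I_FINAL ** 2  # eq. (5.174)
print(w_linear, w_smooth, expected)
```

Both ramps end at the same current and therefore require the same work, consistent with this energy being stored in the magnetic field and fully recoverable.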


6 Multipole expansion for static fields


When a source is localized in a volume $V$ of typical linear size $d$, and we
are only interested in the field that it generates at distances $r \gg d$, we
expect, physically, that only the gross features of the source distribution
will be important, rather than all its fine details. The tool that allows
us to formalize this intuition is the multipole expansion. If the source is
static, the multipole expansion is an expansion in powers of $d/r$. When
the source is time-dependent a new parameter enters into play, which
is the typical frequency $\omega$ of the source, and this defines a new
length-scale $\bar{\lambda} = c/\omega$. In this case the expansion is more complex and, even
when $r \gg d$, still depends on the relative values of $r$ and $\bar{\lambda}$. In this
chapter we study the multipole expansion for static sources, both in
electrostatics and in magnetostatics. This will allow us to introduce
the static multipoles of the source. The time-dependent situation, that
will lead to a multipole expansion of the electromagnetic radiation in
terms of radiative multipoles, is more complicated and will be studied
in Section 11.2, after we have developed the formalism for dealing
with time-dependent fields and electromagnetic radiation.


6.1 Electric multipoles 


In eq. (4.16) we found the solution for the scalar gauge potential $\phi(\mathbf{x})$
in the presence of a charge distribution $\rho(\mathbf{x})$, taken to be static and
localized (or, at least, decreasing sufficiently rapidly at large distances, so
that the integral in eq. (4.16) converges), but otherwise arbitrary. This
solution is still given in terms of an integral, that cannot be computed
analytically for a generic $\rho(\mathbf{x})$. We now assume that the source is
localized in a region of typical linear size $d$, and we study the solution for
$\phi(\mathbf{x})$, and the corresponding electric field, in the limit $r \gg d$.

We begin by observing that, in eq. (4.16), $\mathbf{x}'$ is an integration variable
that in principle runs over all the three-dimensional space $\mathbb{R}^3$. However,
since we have assumed that $\rho(\mathbf{x}') = 0$ for $|\mathbf{x}'| > d$, in practice the integral
in eq. (4.16) only runs over the values of $\mathbf{x}'$ with $|\mathbf{x}'| < d$. In the limit
$r \gg d$ we can therefore expand $|\mathbf{x} - \mathbf{x}'|$ for $|\mathbf{x}'| \ll |\mathbf{x}|$, i.e., in powers
of the small parameter $d/r$. Before studying the full expansion, let us
consider the first two non-trivial terms. These can be obtained defining
$\hat{\mathbf{n}}$ as the unit vector in the direction of $\mathbf{x}$, so that $\mathbf{x} = r\hat{\mathbf{n}}$, and writing

\[
|\mathbf{x}-\mathbf{x}'|^2 = r^2 - 2r\,\hat{\mathbf{n}}\cdot\mathbf{x}' + |\mathbf{x}'|^2
= r^2\left[1 - \frac{2\hat{\mathbf{n}}\cdot\mathbf{x}'}{r} + O\!\left(\frac{d^2}{r^2}\right)\right]\,, \tag{6.1}
\]

so that

\[
|\mathbf{x}-\mathbf{x}'| = r\left[1 - \frac{\hat{\mathbf{n}}\cdot\mathbf{x}'}{r} + O\!\left(\frac{d^2}{r^2}\right)\right] \tag{6.2}
\]

\[
\phantom{|\mathbf{x}-\mathbf{x}'|} = r - \hat{\mathbf{n}}\cdot\mathbf{x}' + O\!\left(\frac{d^2}{r}\right)\,. \tag{6.3}
\]

Fig. 6.1 A graphical illustration of the relation given in eq. (6.3).

1 Observe that $r$ is the distance of the point $\mathbf{x}$ from an origin of the reference
frame, that we have chosen as an arbitrary point inside the charge distribution.
We will later discuss the dependence of the multipole expansion on the choice
of origin.


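The first two terms of this expansion have a simple graphical interpretation
shown in Fig. 6.1. From eq. (6.2) we find that

\[
\frac{1}{|\mathbf{x}-\mathbf{x}'|} = \frac{1}{r}\left[1 + \frac{\hat{\mathbf{n}}\cdot\mathbf{x}'}{r} + O\!\left(\frac{d^2}{r^2}\right)\right]\,. \tag{6.4}
\]

Inserting this into eq. (4.16) we get

\[
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{1}{r}\int d^3x'\, \rho(\mathbf{x}')\left[1 + \frac{\hat{\mathbf{n}}\cdot\mathbf{x}'}{r} + O\!\left(\frac{d^2}{r^2}\right)\right]\,. \tag{6.5}
\]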

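These are the first two terms of the so-called multipole expansion.¹ The
first term, that in this context is also called the “monopole” term, is just
the Coulomb potential (4.6),

\[
\phi_{\rm monopole}(r) = \frac{1}{4\pi\epsilon_0}\, \frac{q}{r}\,, \tag{6.6}
\]

where

\[
q = \int d^3x\, \rho(\mathbf{x}) \tag{6.7}
\]

is the total electric charge of the system. We define the dipole moment
of the electric charge distribution, or, simply, the “electric dipole,” as

\[
\mathbf{d} = \int d^3x\, \rho(\mathbf{x})\, \mathbf{x}\,. \tag{6.8}
\]

Then, the second term in eq. (6.5) can be written as

\[
\phi_{\rm dipole}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{\mathbf{d}\cdot\hat{\mathbf{n}}}{r^2}\,, \tag{6.9}
\]

or, equivalently,

\[
\phi_{\rm dipole}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{\mathbf{d}\cdot\mathbf{x}}{r^3}\,. \tag{6.10}
\]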

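Observe that, while $\phi_{\rm monopole}$ depends only on $r$, $\phi_{\rm dipole}(\mathbf{x})$ depends
both on $r$ and on $\hat{\mathbf{n}}$. The corresponding electric fields are obtained from
$\mathbf{E} = -\nabla\phi$, since $\mathbf{A} = 0$. Of course, $\phi_{\rm monopole}(r)$ generates the Coulomb
electric field $\mathbf{E}_{\rm Coulomb} = q\hat{\mathbf{n}}/(4\pi\epsilon_0 r^2)$. The electric field generated by
the dipole is

\[
(\mathbf{E}_{\rm dipole})_i = -\frac{1}{4\pi\epsilon_0}\, d_j\, \partial_i\!\left(\frac{x_j}{r^3}\right)
= \frac{1}{4\pi\epsilon_0}\, \frac{d_j}{r^3}\, (3 n_i n_j - \delta_{ij})\,, \tag{6.11}
\]

where we used eq. (6.10) and, to compute $\partial_i$ on a function of $r$, we used
$\partial_i f(r) = (\partial_i r)\, df/dr$, and

\[
\partial_i r = \partial_i (x_j x_j)^{1/2} = n_i\,. \tag{6.12}
\]

We can rewrite eq. (6.11) in vector form, as

\[
\mathbf{E}_{\rm dipole} = \frac{1}{4\pi\epsilon_0}\, \frac{3(\mathbf{d}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{d}}{r^3}\,. \tag{6.13}
\]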
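As a quick consistency check of eq. (6.13), one can compare it with minus the numerical gradient of the dipole potential (6.10). The sketch below works in units where $1/(4\pi\epsilon_0) = 1$; the dipole vector and the observation point are arbitrary illustrative values, and `minus_gradient` is a helper introduced here, based on central finite differences.

```python
import math

def dipole_potential(d, x):
    """Eq. (6.10), in units where 1/(4*pi*eps0) = 1."""
    r = math.sqrt(sum(c * c for c in x))
    return sum(d[i] * x[i] for i in range(3)) / r ** 3

def dipole_field(d, x):
    """Eq. (6.13), in the same units."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    d_dot_n = sum(d[i] * n[i] for i in range(3))
    return [(3 * d_dot_n * n[i] - d[i]) / r ** 3 for i in range(3)]

def minus_gradient(potential, d, x, h=1e-5):
    """E = -grad(phi), component by component, by central differences."""
    field = []
    for i in range(3):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        field.append(-(potential(d, up) - potential(d, down)) / (2 * h))
    return field

dipole = [0.3, -0.2, 1.0]
point = [1.5, 0.7, -2.0]
analytic = dipole_field(dipole, point)
numeric = minus_gradient(dipole_potential, dipole, point)
print(analytic)
print(numeric)
```

The two lists agree component by component, up to the finite-difference error.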


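Fig. 6.2 shows the field lines of the dipole electric field. We now work out
the next term in the expansion. To systematically carry out the multipole
expansion to higher orders it is convenient to write the expansion
of $1/|\mathbf{x} - \mathbf{x}'|$, for $|\mathbf{x}'|$ small compared to $|\mathbf{x}|$, in the form

\[
\frac{1}{|\mathbf{x}-\mathbf{x}'|} = \frac{1}{r} - x'_i\, \partial_i \frac{1}{r} + \frac{1}{2}\, x'_i x'_j\, \partial_i\partial_j \frac{1}{r} + \dots\,, \tag{6.14}
\]

where, again, $r = |\mathbf{x}|$ and $\partial_i$ is the derivative with respect to $x^i$. The
corresponding expansion of $\phi(\mathbf{x})$ in eq. (4.16) is

\[
4\pi\epsilon_0\, \phi(\mathbf{x}) = \frac{1}{r}\int d^3x'\, \rho(\mathbf{x}')
- \left(\partial_i \frac{1}{r}\right)\int d^3x'\, \rho(\mathbf{x}')\, x'_i
+ \frac{1}{2}\left(\partial_i\partial_j \frac{1}{r}\right)\int d^3x'\, \rho(\mathbf{x}')\, x'_i x'_j + \dots\,. \tag{6.15}
\]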


The first two terms give again the monopole and dipole terms computed
previously. The third term can be further transformed observing that we
are interested in the field at distances $r$ much larger than the size of the
region $d$ where the source is localized, so, in particular, for $r \neq 0$. In this
case, from eq. (130), $\nabla^2(1/r) = 0$. Then, in the third term of eq. (6.15),
we can replace $x'_i x'_j$ with the traceless combination $x'_i x'_j - (1/3)\delta_{ij}|\mathbf{x}'|^2$,
since

\[
x'_i x'_j\, \partial_i\partial_j \frac{1}{r} = \left(x'_i x'_j - \frac{1}{3}\delta_{ij}|\mathbf{x}'|^2\right)\partial_i\partial_j \frac{1}{r} + \frac{1}{3}|\mathbf{x}'|^2\, \delta_{ij}\,\partial_i\partial_j \frac{1}{r}\,, \tag{6.16}
\]

and, in the last term on the right-hand side of eq. (6.16), we can use
$\delta_{ij}\partial_i\partial_j(1/r) = \nabla^2(1/r) = 0$. We then define

\[
q_{ij} = \int d^3x\, \rho(\mathbf{x}) \left(x_i x_j - \frac{1}{3}\delta_{ij}|\mathbf{x}|^2\right)\,, \tag{6.17}
\]

which is called the “reduced quadrupole moment” of the charge distribution.
The advantage of removing the trace part in $q_{ij}$ is that, as we
mentioned in Section 1.7.2 (and we showed explicitly for a tensor $T_{ij}$
with two indices), symmetric trace-less tensors (or “STF” tensors, for
“symmetric trace-free”) provide irreducible representations of the rotation
group. Working with the irreducible representations, we are directly
working with the fundamental “building blocks” of the representation
theory, so it is conceptually clearer, and this also typically brings technical
simplifications in various computations.

While the normalization in (6.17) is the most natural from the point
of view of STF tensors, for historical reasons the name “quadrupole
moment” is usually reserved for the quantity $Q_{ij} = 3q_{ij}$,²

\[
Q_{ij} = \int d^3x\, \rho(\mathbf{x}) \left(3 x_i x_j - \delta_{ij}|\mathbf{x}|^2\right)\,. \tag{6.18}
\]

2 Actually, there are different conventions in the literature. Our definition
agrees with Jackson (1998) (and with Landau and Lifschits (1975), that,
however, denotes it as $D_{ij}$) while, for instance, Griffiths (2017) defines the
quadrupole moment as

\[
Q_{ij} = \frac{1}{2}\int d^3x\, \rho(\mathbf{x}) \left(3 x_i x_j - \delta_{ij}|\mathbf{x}|^2\right)\,,
\]

while Zangwill (2013) defines it as

\[
Q_{ij} = \frac{1}{2}\int d^3x\, \rho(\mathbf{x})\, x_i x_j\,.
\]

Fig. 6.2 The field lines of an electric dipole.

Fig. 6.3 An example of a distribution of charges leading to a dipole
moment (upper panel) and to a quadrupole moment (lower panel).

The corresponding term in the potential is given by

\[
\begin{aligned}
4\pi\epsilon_0\, \phi_{\rm quadrupole}(\mathbf{x}) &= \frac{1}{2}\, q_{ij}\, \partial_i\partial_j \frac{1}{r} \\
&= -\frac{1}{2}\, q_{ij}\, \frac{\delta_{ij} - 3 n_i n_j}{r^3} \\
&= \frac{1}{2r^3}\, Q_{ij}\, n_i n_j\,.
\end{aligned} \tag{6.19}
\]


Putting together the terms up to the quadrupole, we therefore have

\[
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\left[\frac{q}{r} + \frac{n_i d_i}{r^2} + \frac{n_i n_j Q_{ij}}{2r^3} + \dots\right]\,. \tag{6.20}
\]

In Fig. 6.3 we show an example of a distribution of charges on the surface
of a sphere, leading to an electric dipole moment (upper panel) or to
an electric quadrupole moment (lower panel). The two different colors
describe an excess of positive and negative charges, respectively. In all
cases, the net charge is zero. In the configuration in the upper panel
there is a positive net charge in the $z > 0$ hemisphere and a negative net
charge in the $z < 0$ hemisphere. The charge distribution $\rho(\mathbf{x})$ has been
taken to be odd under $z \to -z$. Then, the integrand $\rho(\mathbf{x})x_i$ in eq. (6.8)
is even only when the index $i = z$, while for $i = x$ or $i = y$ the integration
over $d^3x$ gives zero (we do not put the prime here over the integration
variable). This leads to a dipole moment along the $z$ axis, while $d_x$ and
$d_y$ vanish. In the distribution shown in the lower panel, in contrast,
$\rho(\mathbf{x}) = \rho(-\mathbf{x})$. Then, the dipole vanishes since the integrand $\rho(\mathbf{x})\mathbf{x}$ is
odd under $\mathbf{x} \to -\mathbf{x}$. However, this configuration has a non-vanishing
quadrupole moment and, in this example, only $Q_{xz}$ and $Q_{zx}$ are
non-zero. This can be understood observing that the charge distribution in
the lower panel of Fig. 6.3 is odd under the parity transformation
$\{x \to -x, y \to y, z \to z\}$ and also under the parity transformation
$\{x \to x, y \to y, z \to -z\}$. Therefore, the integral of $\rho(\mathbf{x}')|\mathbf{x}'|^2$ in eq. (6.18) vanishes,
because its integrand is odd under any of these parity transformations.
The term $\rho(\mathbf{x})x_i x_j$ is even under both, and therefore has a non-vanishing
integral, only when $i = x, j = z$ (or $i = z, j = x$), so, for the charge
distribution shown in the lower panel of Fig. 6.3, only the $Q_{xz} = Q_{zx}$
component of the symmetric traceless tensor $Q_{ij}$ is non-vanishing.

It should be observed that the expression for the multipole moments
depends on the choice of the origin of the reference frame. If we perform
a translation of the origin of the reference frame by a vector $-\mathbf{s}$, a point $P$
that before the transformation had the coordinate $\mathbf{x}$ will have coordinate
$\mathbf{x}' = \mathbf{x} + \mathbf{s}$ in the new frame, i.e., the coordinates transform as

\[
\mathbf{x} \to \mathbf{x}' = \mathbf{x} + \mathbf{s}\,. \tag{6.21}
\]

Under this transformation the charge density transforms as a “scalar
under translations,”³

\[
\rho(\mathbf{x}) \to \rho'(\mathbf{x}') = \rho(\mathbf{x})\,. \tag{6.22}
\]

Since also $d^3x$ is invariant under translation, $d^3x' = d^3x$, the total charge
$q$ of the system is independent of the choice of the origin of the reference
frame, as of course we expect. However, the dipole moment transforms
as

\[
\mathbf{d} \to \mathbf{d}' = \int d^3x'\, \rho'(\mathbf{x}')\, \mathbf{x}' = \int d^3x\, \rho(\mathbf{x})\, (\mathbf{x} + \mathbf{s}) = \mathbf{d} + q\,\mathbf{s}\,, \tag{6.23}
\]

where $q$ is the total charge. Therefore, the electric dipole moment is
invariant under shifts of the origin only for a system with zero total
charge. Similarly, from eq. (6.18),

\[
Q_{ij} \to Q_{ij} + q\left(3 s_i s_j - \delta_{ij}|\mathbf{s}|^2\right) + 3\left(d_i s_j + d_j s_i\right) - 2\delta_{ij}\,(\mathbf{s}\cdot\mathbf{d})\,, \tag{6.24}
\]

and therefore it is invariant only if both $q$ and $\mathbf{d}$ vanish. These
transformations are, indeed, precisely those required so that the full expansion
(6.5), when carried out to all orders, is independent of the choice of the
origin. This is as it should be, since the original expression that we
are expanding, eq. (4.16), makes no reference to a choice of origin: as
we have seen, $d^3x'$ and $\rho(\mathbf{x}')$ are invariant under translations, and also
$\mathbf{x} - \mathbf{x}'$ is invariant under translation. However, the latter property is
apparently lost in the expansion (6.4), at least as long as we truncate it
to a finite order.

3 This expresses the fact that the numerical value of the charge density at
a given point $P$ is the same independently of where we put the origin of a
reference frame. In a frame with a given origin, the point $P$ will be labeled by a
coordinate $\mathbf{x}$, while in a frame with a different origin it will be labeled by a
coordinate $\mathbf{x}'$ and, numerically, $\mathbf{x}' \neq \mathbf{x}$. However, under this change of reference
frame, the functional form of the charge density changes, from $\rho(\mathbf{x})$ to a new
function $\rho'(\mathbf{x})$, such that the numerical value of $\rho'$ in $\mathbf{x}'$ is the same as the
numerical value of $\rho$ in $\mathbf{x}$, so that, in the end, the numerical value at the point
$P$ remains the same, independently of how $P$ is labeled.

To see how the independence from the choice of the origin is recovered,
thanks to the transformations of the multipole moments such as those
in eqs. (6.23) and (6.24) for the dipole and quadrupole, we proceed as
follows. Under the transformation (6.21), the coordinate of the point $P$
in which we are computing the field changes from $\mathbf{x}$ to $\mathbf{x} + \mathbf{s}$. Therefore,
using eq. (6.4) and working to first order in $\mathbf{s}$ only, the term $q/r$ in
eq. (6.20) transforms as

\[
\frac{q}{r} \to \frac{q}{|\mathbf{x}+\mathbf{s}|} = \frac{q}{r} - q\, s_i\, \frac{n_i}{r^2} + \dots\,. \tag{6.25}
\]


This extra term $-q s_i n_i/r^2$ is precisely canceled by the transformation
(6.23) of the dipole, in the term $n_i d_i/r^2$ in eq. (6.20). To second order
in $\mathbf{s}$, there will be a contribution from the $q/r$ term in eq. (6.20), due
to the expansion of eq. (6.25) to second order in $\mathbf{s}$, as well as a further
contribution from the term $n_i d_i/r^2$, due to the expansion of $1/r^2$ to first
order in $\mathbf{s}$. These are precisely the contributions that are canceled by
the transformation (6.24) of the quadrupole. This continues to all orders,
with cancellations among terms of different orders in the multipole
expansion. In this way, the full expansion at all orders is independent 
of the choice of origin. Note, however, that any truncation of the ex- 
pansion to finite order (e.g., restricting to the monopole and dipole) will 
retain a dependence on the choice of origin. In practice, as long as we 
compute the field at a distance r from the localized charge distribution, 
much larger than the linear size d of the distribution itself, any choice 
of origin inside the localization volume of the charge will give basically 
equivalent results, with differences that disappear quickly as we go to 
larger distances (or include higher multipoles in the truncation). From 
the mathematical point of view, nothing forbids us from performing the 
expansion (6.5) choosing an origin outside, or even very far from the 
localization region of the charge; however, since in eq. (6.4) we are ex- 
panding in powers of x’, taken to be small with respect to x, it is clearly 
convenient to use a choice of origin such that, when p(x’) in eq. (6.5) 
is non-zero, x’ is as small as possible, which is obtained by choosing 
the origin somewhere inside the charge distribution (for instance, in the 
“center-of-charge,” the equivalent of the center of mass for the charge 
density). Otherwise, the consequence would be that a truncation of the 
expansion to any finite order would be a worse approximation to the 
exact result. 
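The origin dependence of a truncated expansion can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes a hypothetical set of three point charges, units in which $4\pi\epsilon_0 = 1$, and two arbitrary choices of origin inside the charge distribution. It compares the exact potential with the monopole-plus-dipole truncation computed with respect to each origin.

```python
import math

def exact_potential(charges, point):
    """Exact potential of point charges, in units where 4*pi*eps0 = 1."""
    return sum(q / math.dist(point, pos) for q, pos in charges)

def truncated_potential(charges, origin, point):
    """Monopole + dipole terms only, with the moments defined about `origin`."""
    q_tot = sum(q for q, _ in charges)
    dipole = [sum(q * (pos[k] - origin[k]) for q, pos in charges) for k in range(3)]
    rel = [point[k] - origin[k] for k in range(3)]      # field point seen from origin
    r = math.sqrt(sum(c * c for c in rel))
    n_dot_d = sum(rel[k] * dipole[k] for k in range(3)) / r
    return q_tot / r + n_dot_d / r**2

# Hypothetical configuration: three charges within about 0.2 of the point (0, 0, 0).
charges = [(1.0, (0.1, 0.0, 0.0)), (0.5, (-0.2, 0.1, 0.0)), (-0.3, (0.0, 0.0, 0.15))]
origins = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.0)]

def field_point(r):
    """Point at distance r from (0, 0, 0) along the fixed direction (1, 2, 2)/3."""
    return (r / 3.0, 2.0 * r / 3.0, 2.0 * r / 3.0)

for r in (5.0, 10.0, 20.0):
    p = field_point(r)
    errors = [abs(truncated_potential(charges, o, p) - exact_potential(charges, p))
              for o in origins]
    print(r, errors)   # truncation error for each choice of origin
```

With these assumed numbers, the two truncations differ from each other, but both errors shrink as $r$ grows, as expected for a correction that starts at the quadrupole order.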

The expansion in STF tensors is easily generalized to arbitrary order. 
Equation (6.14) becomes 


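\[
\frac{1}{|\mathbf{x}-\mathbf{x}'|} = \sum_{l=0}^{\infty} \frac{(-1)^l}{l!}\, x'_{i_1}\cdots x'_{i_l}\, \partial_{i_1}\cdots\partial_{i_l}\frac{1}{r}\,. \tag{6.26}
\]

Again, using $\nabla^2(1/r) = 0$ for $r \neq 0$, we can remove the traces with
respect to any pair of indices. We then define

\[
q_{i_1\ldots i_l} = \int d^3x'\, x'_{\langle i_1}\cdots x'_{i_l\rangle}\, \rho(\mathbf{x}')\,, \tag{6.27}
\]

where the brackets in $x'_{\langle i_1}\cdots x'_{i_l\rangle}$ mean that we must take the trace-free
part of the symmetric tensor $x'_{i_1}\cdots x'_{i_l}$, i.e., we must remove the traces
with respect to all pairs of indices. For instance, for the tensor with three
indices, we can replace $x_i x_j x_k$ with

\[
x_{\langle i}x_j x_{k\rangle} = x_i x_j x_k - \frac{1}{5}\, r^2 \left(\delta_{ij}x_k + \delta_{ik}x_j + \delta_{jk}x_i\right). \tag{6.28}
\]

The coefficients $q_{i_1\ldots i_l}$ are called the STF multipole moments of the
charge distribution. Then, inserting eq. (6.26) into eq. (4.16), we get

\[
\phi(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\sum_{l=0}^{\infty}\frac{(-1)^l}{l!}\, q_{i_1\ldots i_l}\,\partial_{i_1}\cdots\partial_{i_l}\frac{1}{r}\,. \tag{6.29}
\]

The terms $l = 0$, $l = 1$, and $l = 2$ in eq. (6.29) are just the monopole,
dipole, and quadrupole terms that we found previously.⁴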
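As a small illustration of the trace removal for three indices, the sketch below builds the tensor $x_i x_j x_k - (r^2/5)(\delta_{ij}x_k + \delta_{ik}x_j + \delta_{jk}x_i)$ for an arbitrary vector and checks that contracting any pair of indices gives zero. The function names and the sample vector are assumptions made only for this example.

```python
def stf3(x):
    """Trace-free part of x_i x_j x_k for a 3-vector x (nested 3x3x3 list)."""
    r2 = sum(c * c for c in x)
    delta = lambda a, b: 1.0 if a == b else 0.0
    return [[[x[i] * x[j] * x[k]
              - r2 / 5.0 * (delta(i, j) * x[k] + delta(i, k) * x[j] + delta(j, k) * x[i])
              for k in range(3)] for j in range(3)] for i in range(3)]

def trace_over_first_pair(t):
    """Contract the first two indices: returns the vector t_iik."""
    return [sum(t[i][i][k] for i in range(3)) for k in range(3)]

x = (0.3, -1.2, 2.0)             # arbitrary sample vector
t = stf3(x)
print(trace_over_first_pair(t))  # each component should vanish up to rounding
```

Since the tensor is fully symmetric, the vanishing of one trace implies that all the others vanish as well.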


6.2 Magnetic multipoles 


We now consider the multipole expansion in magnetostatics. We start 
from eq. (4.92), and we consider the situation in which the current j(x’) 
is localized in a region with |x’| < d. We then compute A(x) at dis- 
tances r >> d, where, as before, r = |x|. We limit ourselves to the first 
non-trivial term that, as we will see, is the magnetic dipole term. The 
generic expansion to all orders in STF tensors can be performed simi- 
larly to what we have done in the electric case, but one rarely encounters 
situations where magnetic multipoles higher than the dipole play a role. 

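We therefore insert the expansion (6.4) into eq. (4.92) [technically, it is
slightly simpler to perform the multipole expansion of $\mathbf{A}(\mathbf{x})$ and then
obtain the corresponding expansion for $\mathbf{B}(\mathbf{x})$ from $\mathbf{B} = \nabla\times\mathbf{A}$, rather than
performing the corresponding expansion directly in eq. (4.95)]. This
gives

\[
\mathbf{A}(\mathbf{x}) = \frac{\mu_0}{4\pi}\int d^3x'\, \mathbf{j}(\mathbf{x}')\left[\frac{1}{r} + \frac{\mathbf{x}\cdot\mathbf{x}'}{r^3} + O\!\left(\frac{d^2}{r^3}\right)\right]. \tag{6.30}
\]

The first term in the expansion actually vanishes, as a consequence of
current conservation. This can be shown using the identity $\partial_j x_i = \delta_{ij}$
to rewrite, trivially, $j_i(\mathbf{x}) = j_j(\mathbf{x})\,\partial_j x_i$. Then

\[
\begin{aligned}
\int d^3x\, j_i(\mathbf{x}) &= \int d^3x\, j_j(\mathbf{x})\,\partial_j x_i \\
&= -\int d^3x\, x_i\,\partial_j j_j(\mathbf{x}) \\
&= 0\,,
\end{aligned} \tag{6.31}
\]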


where in the second line we integrated by parts (setting the boundary 
terms to zero, because j(x) is localized) and we used current conservation 
in the form (4.90), valid for magnetostatics. The leading term is then 
given by the second term, which we denote as $\mathbf{A}_{\rm dipole}$. In components,

\[
A_{i,{\rm dipole}}(\mathbf{x}) = \frac{\mu_0}{4\pi}\,\frac{x_j}{r^3}\int d^3x'\, j_i(\mathbf{x}')\,x'_j\,, \tag{6.32}
\]

⁴ The expansion in STF tensors can be
rewritten as an expansion in spherical 
harmonics. We will not elaborate on 
this here. The interested reader can 
find a discussion of the relation between 
these two expansions, as well as the 
extension to vector spherical harmon- 
ics (relevant in electromagnetism) and 
tensor spherical harmonics (relevant for 
the gravitational field) in Section 3.5.2 
of Maggiore (2007).

⁵ Observe that, under a change of the
origin used to define the multipole moment,
corresponding to changing $\mathbf{x} \to \mathbf{x}' = \mathbf{x} + \mathbf{s}$,
we have $\mathbf{j}(\mathbf{x}) \to \mathbf{j}'(\mathbf{x}') = \mathbf{j}(\mathbf{x})$,
i.e., the current density is invariant
under translations, and $d^3x' = d^3x$,
and therefore

\[
m_i \to m_i + \frac{1}{2}\,\epsilon_{ijk}\, s_j \int d^3x\, j_k(\mathbf{x})\,. \tag{6.35}
\]

However, because of eq. (6.31), for a localized
current density the extra term
vanishes, and therefore, in magnetostatics
(i.e., when $\nabla\cdot\mathbf{j} = 0$ holds, so
that also eq. (6.31) holds) the magnetic
dipole moment is independent of the 
choice of the origin. This is due to 
the fact that, in the expansion in mag- 
netic multipoles, there is no monopole 
term, i.e., no magnetic charge, so the 
situation is analogous to eq. (6.23) for 
the electric dipole, when q = 0. For 
magnetic multipoles, the dependence 
on the choice of the origin starts from 
the magnetic quadrupole (if the mag- 
netic dipole is non-vanishing, otherwise 
it starts at even higher orders). 


Fig. 6.4 The field lines of a magnetic 
dipole, represented here by a closed 
loop carrying a current. 


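where we wrote $n_j$ as $x_j/r$. To manipulate this expression, we proceed
similarly to the previous computation, integrating by parts and using
current conservation,

\[
\begin{aligned}
\int d^3x\, j_i(\mathbf{x})\,x_j &= \int d^3x\, (\partial_k x_i)\, j_k(\mathbf{x})\, x_j \\
&= -\int d^3x\, x_i\, \partial_k\!\left[ j_k(\mathbf{x})\, x_j\right] \\
&= -\int d^3x\, x_i\, j_k(\mathbf{x})\,\partial_k x_j \\
&= -\int d^3x\, j_j(\mathbf{x})\, x_i\,.
\end{aligned} \tag{6.33}
\]

This shows that the integral is antisymmetric in the $(i, j)$ indices, and
therefore

\[
\begin{aligned}
\int d^3x\, j_i x_j &= \frac{1}{2}\int d^3x\, (j_i x_j - j_j x_i) \\
&= -\frac{1}{2}\,\epsilon_{ijk}\int d^3x\, [\mathbf{x}\times\mathbf{j}(\mathbf{x})]_k\,,
\end{aligned} \tag{6.34}
\]

where the last identity can be verified writing $\epsilon_{ijk}(\mathbf{x}\times\mathbf{j})_k = \epsilon_{ijk}\epsilon_{klm}x_l j_m$
and using eq. (1.7). We define the magnetic dipole moment of the current
distribution (or, more simply, the “magnetic moment”), as

\[
\mathbf{m} = \frac{1}{2}\int d^3x\, \mathbf{x}\times\mathbf{j}(\mathbf{x})\,. \tag{6.36}
\]

Therefore

\[
\int d^3x\, j_i x_j = -\epsilon_{ijk}\, m_k\,, \tag{6.37}
\]

and eq. (6.32) becomes

\[
\mathbf{A}_{\rm dipole}(\mathbf{x}) = \frac{\mu_0}{4\pi}\,\frac{\mathbf{m}\times\mathbf{x}}{r^3}\,. \tag{6.38}
\]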


Taking the curl (for $r \neq 0$, given that we are computing the magnetic
field generated by a localized source, in an expansion for $r \gg d$), we get

\[
(B_{\rm dipole})_i = \frac{\mu_0}{4\pi}\,\frac{1}{r^3}\,(3 n_i n_j - \delta_{ij})\, m_j\,, \tag{6.39}
\]

or, in vector form,

\[
\mathbf{B}_{\rm dipole} = \frac{\mu_0}{4\pi}\,\frac{3(\mathbf{m}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{m}}{r^3}\,, \tag{6.40}
\]

to be compared with eq. (6.13) for the electric field generated by an electric
dipole. This is the leading term for the magnetic field generated by
a localized current, at large distances. Fig. 6.4 shows the corresponding
lines of the magnetic field, to be compared with Fig. 6.2 for the electric 
field from the electric dipole. 
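The step from the dipole vector potential to the dipole magnetic field can be cross-checked numerically. The sketch below is only an illustration under stated assumptions: units with $\mu_0/(4\pi) = 1$, an arbitrary magnetic moment and field point, and a simple central-difference curl (the helper `curl_fd` is introduced here and is not from the text). It compares $\nabla\times(\mathbf{m}\times\mathbf{x}/r^3)$ with $[3(\mathbf{m}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{m}]/r^3$ at a point with $r \neq 0$.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def a_dipole(m, x):
    """Vector potential m x x / r^3, in units where mu0/(4*pi) = 1."""
    r = math.sqrt(sum(c * c for c in x))
    return tuple(c / r**3 for c in cross(m, x))

def b_dipole(m, x):
    """Dipole field [3 (m.n) n - m] / r^3 in the same units, valid for r != 0."""
    r = math.sqrt(sum(c * c for c in x))
    n = [c / r for c in x]
    m_dot_n = sum(m[k] * n[k] for k in range(3))
    return tuple((3.0 * m_dot_n * n[k] - m[k]) / r**3 for k in range(3))

def curl_fd(field, x, h=1e-5):
    """Central-difference curl of a vector field at the point x."""
    def d(comp, axis):
        xp, xm = list(x), list(x)
        xp[axis] += h
        xm[axis] -= h
        return (field(xp)[comp] - field(xm)[comp]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

m = (0.2, -0.5, 1.0)     # arbitrary magnetic moment
x = (0.7, -0.4, 1.1)     # arbitrary field point away from the origin
print(curl_fd(lambda y: a_dipole(m, y), x))
print(b_dipole(m, x))
```

The two printed vectors should agree up to the discretization error of the finite differences.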

A typical situation in which a magnetic dipole moment appears is
given by a loop $C$ carrying a current $I$. Using eq. (4.103), eq. (6.36)
becomes

\[
\mathbf{m} = \frac{I}{2}\oint_C \mathbf{x}\times d\boldsymbol{\ell}\,. \tag{6.41}
\]

For a loop lying in a plane, the integral in this expression is just twice
the area $A$ of the loop times the normal $\hat{\mathbf{n}}$ to the surface, as we proved
in eq. (1.47) using Stokes’s theorem,⁶ so

\[
\mathbf{m} = I A\, \hat{\mathbf{n}}\,. \tag{6.42}
\]
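For a polygonal loop the line integral can be evaluated exactly, which gives a quick check of the result $\mathbf{m} = IA\hat{\mathbf{n}}$ and of its independence of the origin. The sketch below is an illustration only: the function `loop_moment`, the rectangle and the value of the current are assumptions made for this example. It uses the fact that, for a straight segment from $\mathbf{p}$ to $\mathbf{q}$, the integral of $\mathbf{x}\times d\boldsymbol{\ell}$ equals $\mathbf{p}\times\mathbf{q}$.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_moment(current, vertices, origin=(0.0, 0.0, 0.0)):
    """m = (I/2) * closed-loop integral of x cross dl for a polygonal loop.

    For a straight segment from p to q the integral of x cross dl is exactly
    p cross q, so the loop integral reduces to a finite sum over the sides.
    """
    total = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        p = [vertices[k][i] - origin[i] for i in range(3)]
        q = [vertices[(k + 1) % n][i] - origin[i] for i in range(3)]
        c = cross(p, q)
        for i in range(3):
            total[i] += c[i]
    return tuple(0.5 * current * t for t in total)

# Hypothetical loop: a 2 x 3 rectangle in the z = 0 plane, traversed counterclockwise.
rectangle = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 3.0, 0.0), (0.0, 3.0, 0.0)]
print(loop_moment(1.5, rectangle))
print(loop_moment(1.5, rectangle, origin=(5.0, -1.0, 2.0)))
```

With $I = 1.5$ and area $A = 6$, both calls should return a moment of magnitude 9 along the $z$ axis, regardless of the origin used.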


Another important case is given by a non-relativistic charged particle
with charge $q_a$, mass $m_a$ and velocity $\mathbf{v}_a$. The corresponding current is
given by eq. (3.27) which, inserted into eq. (6.36), gives⁷

\[
\mathbf{m} = \frac{q_a}{2 m_a}\,\mathbf{L}_a\,, \tag{6.43}
\]

where $\mathbf{L}_a = m_a \mathbf{x}_a\times\mathbf{v}_a$ is the angular momentum of a non-relativistic
particle.⁸

Equation (6.13) gives the electric field generated by an electric dipole, at
distances $r$ much larger than the size $d$ of the source, since we obtained
it performing an expansion in the small parameter $d/r$. Consider now
as source a point-like electric dipole $\mathbf{d}$ located at the origin (which could
be, for instance, a classical model of a microscopic object with an
intrinsic dipole moment). In this case the size $d$ of the source vanishes,
and an expansion in powers of $d/r$ gives the exact result for all values of
$r$ except $r = 0$, so eq. (6.13) is exact for all $r \neq 0$. In principle, however,
there could still be a Dirac delta singularity at the origin. To test for
the presence of a Dirac delta in the electric field $\mathbf{E}_{\rm dipole}$ generated by a
point dipole, we integrate it over a volume $V$ with boundary $\partial V$,

\[
\begin{aligned}
\int_V d^3x\, (E_{\rm dipole})_i &= -\int_V d^3x\, \partial_i \phi_{\rm dipole} \\
&= -\oint_{\partial V} d^2s\, n_i\, \phi_{\rm dipole}\,,
\end{aligned} \tag{6.44}
\]

where $d\mathbf{s} = d^2s\,\hat{\mathbf{n}}$ is the surface element on the boundary $\partial V$, and $\hat{\mathbf{n}}$ is
the outer normal of the boundary. We now insert here the expression
(6.9) for $\phi_{\rm dipole}$, and we take $V$ to be a sphere of radius $R$, so that
$d^2s = R^2 d\Omega$. Then,

\[
\int_V d^3x\, (E_{\rm dipole})_i = -\frac{1}{4\pi\epsilon_0}\int R^2 d\Omega\; n_i\, \frac{n_j d_j}{R^2}\,. \tag{6.45}
\]


⁶ This can also be shown, in a less elegant
but possibly more direct manner,
just by explicitly computing the inte- 
gral for a square loop, setting the origin 
for instance in the center of the square 
and computing the separate contribu- 
tions of the four sides. The result for a 
generic planar loop is then obtained by 
filling it with infinitesimal square loops. 


⁷ Note, however, that in this case the
current also depends on time and satisfies
the full continuity equation rather
than $\nabla\cdot\mathbf{j} = 0$, see eq. (3.30). Therefore,
this example does not belong to
the domain of magnetostatics and in 
this case eq. (6.31) does not hold. As 
a consequence, it is no longer true that 
the magnetic dipole is independent of 
the choice of the origin (compare with 
Note 5 on page 140). Indeed, we see 
from eq. (6.43) that, in this case, the 
magnetic dipole is given in terms of 
the orbital angular momentum, which 
is another quantity that depends on the 
choice of origin. 


⁸ In quantum mechanics, particles carry
an intrinsic angular momentum called
spin, usually denoted by $\mathbf{S}$. One would
then be tempted to assume that, besides
having a magnetic moment associated
with their angular momentum $\mathbf{L}$, which
indeed even in quantum mechanics is
given by eq. (6.43), particles should also
carry an intrinsic magnetic moment
$\mathbf{m} = [q_a/(2m_a)]\,\mathbf{S}$ associated with spin.
However, spin is a purely quantum concept,
and the previous classical derivation
does not go through. It turns
out that the actual intrinsic magnetic
moment of a charged particle is rather
given by $\mathbf{m} = g_a [q_a/(2m_a)]\,\mathbf{S}_a$, where
$g_a$ (usually written simply as $g$, when
it is clear to which particle it refers)
is a number that depends on the type
of particle. For an elementary particle
of spin 1/2, such as the electron, in a
first approximation (given by the Dirac
equation) $g = 2$, and quantum field theory
gives corrections to this result that
can be computed as an expansion in
powers of the fine structure constant $\alpha$,
see e.g., Section 3.6 and Solved Problem
7.2 of Maggiore (2005).

⁹ Actually, one should also observe that
the radial integral diverges at $r = 0$. A
more correct procedure is to regularize
it by integrating from $r = \epsilon$ to $r = R$,
and in the end taking the limit $\epsilon \to 0^+$.
However, since the angular integral
vanishes identically, this limit is indeed
zero.


The remaining angular integral could in principle be performed component
by component, writing $d\Omega = d\cos\theta\, d\phi$ and using the explicit
expression of the unit normal vector in polar coordinates,

\[
\hat{\mathbf{n}} = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)\,. \tag{6.46}
\]


There is, however, a much more clever way of computing an integral
such as

\[
I_{ij} = \int \frac{d\Omega}{4\pi}\, n_i n_j\,, \tag{6.47}
\]

based on the observation that $n_i n_j$ is a tensor under spatial rotations.
After integration over the whole sphere with the measure $d\Omega$, which is
invariant under rotations, the result must still be a tensor. However,
after having integrated over all directions, there is no longer a preferred
direction in space, so no vector on which the result could depend. Since
the result is a symmetric tensor with two indices, the only possibility is
that it is proportional to $\delta_{ij}$, with some proportionality constant $\kappa$,

\[
\int \frac{d\Omega}{4\pi}\, n_i n_j = \kappa\, \delta_{ij}\,. \tag{6.48}
\]

Taking the trace of both sides, and using $n_i n_i = 1$, we get
$\int d\Omega/(4\pi) = 1 = 3\kappa$, and therefore $\kappa = 1/3$. Thus, without even
computing a single integral, we get the identity

\[
\int \frac{d\Omega}{4\pi}\, n_i n_j = \frac{1}{3}\,\delta_{ij}\,. \tag{6.49}
\]

Applying this to eq. (6.45), we get

\[
\int_V d^3x\, (E_{\rm dipole})_i = -\frac{1}{3\epsilon_0}\, d_i\,. \tag{6.50}
\]
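The symmetry argument for the angular average of $n_i n_j$ can be compared with a brute-force quadrature, component by component. The sketch below is an illustration under stated assumptions: a plain midpoint rule in $u = \cos\theta$ and $\phi$, with grid sizes chosen arbitrarily for this example.

```python
import math

def angular_average_nn(n_u=200, n_phi=64):
    """Midpoint-rule estimate of the average of n_i n_j over the unit sphere."""
    avg = [[0.0] * 3 for _ in range(3)]
    # (du * dphi) / (4*pi), with du = 2/n_u and dphi = 2*pi/n_phi
    weight = 1.0 / (n_u * n_phi)
    for a in range(n_u):
        u = -1.0 + (a + 0.5) * 2.0 / n_u      # u = cos(theta)
        s = math.sqrt(1.0 - u * u)            # sin(theta)
        for b in range(n_phi):
            phi = (b + 0.5) * 2.0 * math.pi / n_phi
            n = (s * math.cos(phi), s * math.sin(phi), u)
            for i in range(3):
                for j in range(3):
                    avg[i][j] += weight * n[i] * n[j]
    return avg

for row in angular_average_nn():
    print(row)   # should be close to the 3x3 identity matrix divided by 3
```

The explicit integration requires a double loop over angles, while the rotational argument fixes the result from the trace alone.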


This result, for a point dipole, is exact, because, to compute it, we only
used the expression for $\phi_{\rm dipole}$ on a surface at $r = R > 0$, see eq. (6.44).
In the case of a point-like source that we are considering, as long as $r \neq 0$
the exact result for the electric field is given by eq. (6.11). Allowing for
the possibility of a Dirac delta at the origin, the most general form of the
electric field generated by a point-like electric dipole is

\[
(E_{\rm dipole})_i = \frac{1}{4\pi\epsilon_0}\left[\frac{1}{r^3}\,(3 n_i n_j - \delta_{ij})\, d_j + \kappa^E_i\, \delta^{(3)}(\mathbf{x})\right]. \tag{6.51}
\]

The constant vector $\kappa^E_i$ can be fixed inserting eq. (6.51) into eq. (6.50).
The first term in brackets in eq. (6.51) does not contribute to the integral.
In fact,

\[
\int_V d^3x\, \frac{1}{r^3}\,(3 n_i n_j - \delta_{ij})\, d_j = d_j \int_0^R dr\, r^2\, \frac{1}{r^3} \int d\Omega\, (3 n_i n_j - \delta_{ij})\,, \tag{6.52}
\]

and the angular integral vanishes because of eq. (6.49).⁹ Then, eq. (6.50)
gives $\kappa^E_i = -(4\pi/3)\, d_i$. The conclusion is that the electric field generated
by a point-like electric dipole $\mathbf{d}$ at the origin of the coordinate system is
given by

\[
\mathbf{E}_{\rm dipole}(\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\left[\frac{3(\mathbf{d}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{d}}{r^3} - \frac{4\pi}{3}\,\mathbf{d}\,\delta^{(3)}(\mathbf{x})\right]. \tag{6.53}
\]


We can proceed in exactly the same way to compute the magnetic field
generated by a point-like magnetic dipole $\mathbf{m}$ (which could be, for instance,
a classical model of an elementary particle with an intrinsic
magnetic moment, see Note 8 on page 141). Again, in the limit in
which the size $d$ of the source vanishes, an expansion in powers of $d/r$
gives the exact result for all values of $r \neq 0$, so eqs. (6.38) and (6.40)
are exact for all $r \neq 0$. Just as we did for the electric field, to test for
the presence of a Dirac delta in $\mathbf{B}$ we integrate it over a volume $V$ with
boundary $\partial V$,

\[
\begin{aligned}
\int_V d^3x\, B_i(\mathbf{x}) &= \epsilon_{ijk}\int_V d^3x\, \partial_j A_k(\mathbf{x}) \\
&= \epsilon_{ijk}\oint_{\partial V} d^2s\, n_j A_k(\mathbf{x})\,.
\end{aligned} \tag{6.54}
\]

We insert here the expression (6.38) for $\mathbf{A}$, and we take $V$ to be a sphere
of radius $R$, so that $d^2s = R^2 d\Omega$. Then

\[
\begin{aligned}
\int_V d^3x\, B_i(\mathbf{x}) &= \frac{\mu_0}{4\pi}\,\epsilon_{ijk}\epsilon_{klm}\, m_l \int R^2 d\Omega\; n_j\, \frac{x_m}{R^3} \\
&= \mu_0\, (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, m_l \int \frac{d\Omega}{4\pi}\, n_j n_m\,,
\end{aligned} \tag{6.55}
\]

where we made use of eq. (1.7) and of the fact that, on the surface of the
sphere, $x_m = R\, n_m$. The remaining angular integral is performed using
eq. (6.49), so we eventually get

\[
\int_V d^3x\, B_i(\mathbf{x}) = \mu_0\, (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, m_l\, \frac{1}{3}\,\delta_{jm} = \frac{2}{3}\,\mu_0\, m_i\,. \tag{6.56}
\]

In the case of a point-like source that we are considering, for $r \neq 0$, the
exact result for the magnetic field is given by eq. (6.39). Allowing for the
possibility of a Dirac delta at the origin, the most general form of the
magnetic field generated by a point-like magnetic dipole is therefore

\[
B_i = \frac{\mu_0}{4\pi}\left[\frac{1}{r^3}\,(3 n_i n_j - \delta_{ij})\, m_j + \kappa^B_i\, \delta^{(3)}(\mathbf{x})\right], \tag{6.57}
\]

and the constant vector $\kappa^B_i$ can be fixed by comparing with eq. (6.56).
The first term in brackets in eq. (6.57) has the same angular dependence
as in the electric case and so does not contribute to the integral. Then,
eq. (6.56) gives $\kappa^B_i = (8\pi/3)\, m_i$. The conclusion is that the magnetic
field generated by a point-like magnetic dipole $\mathbf{m}$ is given by¹⁰

\[
\mathbf{B} = \frac{\mu_0}{4\pi}\left[\frac{3(\mathbf{m}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{m}}{r^3} + \frac{8\pi}{3}\,\mathbf{m}\,\delta^{(3)}(\mathbf{x})\right]. \tag{6.58}
\]


¹⁰ The term proportional to the Dirac
delta in eq. (6.58) has an important ap- 
plication in quantum mechanics, where 
it contributes to the hyperfine splitting 
of the hydrogen atom. 




6.4 Multipole expansion of interaction 
potentials 


We now study the interaction potentials associated with electric and 
magnetic multipoles. We consider electric multipoles in an external 
static electric field, as well as the interaction between the multipoles 
of two charge distributions. We will then repeat the analysis for mag- 
netic multipoles. 


6.4.1 Electric multipoles in external field 


As we found in eq. (5.90), in electrostatics the mechanical potential at
fixed charges, $U_E$, is the same as the energy $\mathcal{E}_E$ stored in the electric
field. We will then write the following equations in terms of $U_E$, which
is the quantity that enters directly when computing the forces to which
a given multipole moment is subject, or the forces between multipoles of
different charge distributions, but it can be kept in mind that the same
equations hold in terms of the energy $\mathcal{E}_E$ stored in the electric field.

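We consider first a charge distribution $\rho(\mathbf{x})$ in an external electric
field. We assume that $\rho(\mathbf{x})$ is localized inside a region $V$ that can be
enclosed in a sphere of radius $d$. We also assume that the external
potential is generated by charges localized in a region $V'$, at a distance
$r \gg d$ from $V$ (so, in particular, there is no overlap between $V$ and $V'$).
The condition $r \gg d$ implies that the external electrostatic potential
$\phi_{\rm ext}(\mathbf{x})$ varies slowly across $V$. We then choose an origin inside $V$, so
that the multipole moments are defined with respect to that origin, and
we expand $\phi_{\rm ext}(\mathbf{x})$ in a Taylor series around that origin. Denoting by
$(U_E)_{\rm ext}$ the mechanical potential at fixed charge in an external field, we
have $(U_E)_{\rm ext} = (\mathcal{E}_E)_{\rm ext}$ and the Taylor expansion of $\phi_{\rm ext}(\mathbf{x})$ in eq. (5.39)
gives

\[
\begin{aligned}
(U_E)_{\rm ext} &= \int d^3x\, \rho(\mathbf{x})\left[\phi_{\rm ext}(0) + x^i\partial_i\phi_{\rm ext}(0) + \frac{1}{2}\,x^i x^j\,\partial_i\partial_j\phi_{\rm ext}(0) + \ldots\right] \\
&= \phi_{\rm ext}(0)\int d^3x\, \rho(\mathbf{x}) + \partial_i\phi_{\rm ext}(0)\int d^3x\, \rho(\mathbf{x})\,x^i \\
&\quad + \frac{1}{2}\,\partial_i\partial_j\phi_{\rm ext}(0)\int d^3x\, \rho(\mathbf{x})\,x^i x^j + \ldots \\
&= \phi_{\rm ext}(0)\, q - d_i\, E_{i,{\rm ext}}(0) - \frac{1}{2}\,\partial_i E_{j,{\rm ext}}(0)\int d^3x\, \rho(\mathbf{x})\,x^i x^j + \ldots\,,
\end{aligned} \tag{6.59}
\]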
where $E_{i,{\rm ext}} = -\partial_i\phi_{\rm ext}$ is the external electric field. The first term is the
“monopole interaction” energy, i.e., the interaction energy of the total
charge $q$ of the system with the external potential. The second term is
the mechanical potential energy associated with an electric dipole in an
external electric field, when the position of the volume $V$ is identified
by the value $\mathbf{x} = 0$ of the coordinate of one of its points (and the
multipoles are defined with respect to that point). If we perform a
rigid displacement of the charge density, in the fixed external field, the
position $\mathbf{x} = 0$ is replaced by a generic position $\mathbf{x}$ (with respect to which
we still define the multipoles), and eq. (6.59) becomes

\[
(U_E)_{\rm ext}(\mathbf{x}) = q\,\phi_{\rm ext}(\mathbf{x}) - d_i\, E_{i,{\rm ext}}(\mathbf{x}) - \frac{1}{2}\,\partial_i E_{j,{\rm ext}}(\mathbf{x})\int_V d^3x'\, \rho(\mathbf{x}')\,x'^i x'^j + \ldots\,. \tag{6.60}
\]

The term associated with the dipole defines the potential $(U_E)_{\rm dipole}(\mathbf{x})$,

\[
(U_E)_{\rm dipole}(\mathbf{x}) = -\mathbf{d}\cdot\mathbf{E}_{\rm ext}(\mathbf{x})\,. \tag{6.61}
\]


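According to eq. (5.88), the mechanical force exerted on the dipole is
therefore¹¹

\[
F_k = d_i\, \partial_k E_{i,{\rm ext}}\,. \tag{6.62}
\]

Since, in electrostatics, $\nabla\times\mathbf{E} = 0$, we have $\partial_k E_{i,{\rm ext}} = \partial_i E_{k,{\rm ext}}$, and
eq. (6.62) can also be written as

\[
F_k = d_i\, \partial_i E_{k,{\rm ext}}\,, \tag{6.63}
\]

or, in vector notation,

\[
\mathbf{F} = (\mathbf{d}\cdot\nabla)\,\mathbf{E}_{\rm ext}\,. \tag{6.64}
\]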


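We can also use $(U_E)_{\rm dipole}$ to find the torque acting on a dipole. Consider
a rotation of the dipole by $\delta\boldsymbol{\theta}$ (where, writing $\delta\boldsymbol{\theta} = \delta\theta\,\hat{\mathbf{n}}$, $\hat{\mathbf{n}}$ defines the
direction of the axis around which we perform a rotation and $\delta\theta$ the
rotation angle) around the origin $O$ with respect to which the multipoles
are defined. Under this rotation the electric dipole moment changes by

\[
\delta\mathbf{d} = \delta\boldsymbol{\theta}\times\mathbf{d}\,, \tag{6.65}
\]

which is the transformation of a vector under infinitesimal rotations, see
eq. (1.153). The corresponding change in $(U_E)_{\rm dipole}$ is

\[
\begin{aligned}
\delta (U_E)_{\rm dipole} &= -(\delta\mathbf{d})\cdot\mathbf{E}_{\rm ext} \\
&= -(\delta\boldsymbol{\theta}\times\mathbf{d})\cdot\mathbf{E}_{\rm ext} \\
&= -(\mathbf{d}\times\mathbf{E}_{\rm ext})\cdot\delta\boldsymbol{\theta}\,.
\end{aligned} \tag{6.66}
\]

Just as the force is obtained from a potential from eq. (5.88), i.e., from
$\delta U = -\mathbf{F}\cdot\delta\mathbf{x}$, the torque $\mathbf{N}$ is obtained from¹²

\[
\delta U = -\mathbf{N}\cdot\delta\boldsymbol{\theta}\,. \tag{6.67}
\]

Therefore, the torque on a dipole in an external electric field is

\[
\mathbf{N} = \mathbf{d}\times\mathbf{E}_{\rm ext}\,. \tag{6.68}
\]

This torque tends to align the dipole so that it is parallel to the external
electric field, so that the potential (6.61) is minimized.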
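The relation between the torque $\mathbf{N} = \mathbf{d}\times\mathbf{E}_{\rm ext}$ and the change of $-\mathbf{d}\cdot\mathbf{E}_{\rm ext}$ under a small rotation of the dipole can be checked numerically. The sketch below is an illustration only: the numerical values of the dipole, field, rotation axis and angle are hypothetical, and the rotation is implemented with the Rodrigues formula rather than with the infinitesimal expression, so that the comparison is not a tautology.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def rotate(v, axis, angle):
    """Rotate v by `angle` around the unit vector `axis` (Rodrigues formula)."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, v)
    k_dot_v = dot(axis, v)
    return tuple(v[i] * c + k_cross_v[i] * s + axis[i] * k_dot_v * (1.0 - c)
                 for i in range(3))

def dipole_energy(d, e_field):
    """U = -d . E for a dipole in a fixed external field."""
    return -dot(d, e_field)

def dipole_torque(d, e_field):
    """N = d x E."""
    return cross(d, e_field)

# Hypothetical numbers, chosen only for illustration.
d = (0.3, -0.1, 0.2)
E = (1.0, 2.0, -0.5)
axis = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)   # unit vector
angle = 1e-5

# Rotate the dipole with the field held fixed, and compare with -N . delta_theta.
dU = dipole_energy(rotate(d, axis, angle), E) - dipole_energy(d, E)
N = dipole_torque(d, E)
print(dU, -dot(N, axis) * angle)
```

The two printed numbers should agree up to terms of second order in the rotation angle.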

Note that this is the torque around the center of the dipole, i.e., in the 
frame where the dipole center has the position x = 0. In a frame with 
a different origin of the axes, in which the dipole center has a generic 


¹¹ Note that the dipole moment $\mathbf{d}$ of the
charge distribution does not change un- 
der rigid translations of the charge dis- 
tribution, since it is always defined with 
respect to the new, “translated” point 
chosen to identify the position of the 
volume V. The dependence on x enters 
only through a possible spatial depen- 
dence of the external electric field. If 
the external field is uniform, the force 
vanishes. 


¹² Note that, to define the torque as in
eq. (6.66), we compute how $(U_E)_{\rm dipole}$
changes when we rotate the dipole
with respect to a fixed electric field.
Given that $(U_E)_{\rm dipole}$ is a scalar, if
we simply rotated the reference
frame, transforming both $\mathbf{d}$ and
$\mathbf{E}_{\rm ext}$ accordingly, $(U_E)_{\rm dipole}$ would not
change and $\delta(U_E)_{\rm dipole} = 0$. This is
of course the same as what we do when
we define the force on an object from
$\delta U = -\mathbf{F}\cdot\delta\mathbf{x}$, where we consider the
change in the potential when the position
of the object changes, with respect
to a fixed external field.

¹³ We now need to use the fact that,
under rotations, a tensor $Q_{ij}$ with two
indices transforms as

\[
Q_{ij} \to R_{ik} R_{jl}\, Q_{kl}\,,
\]

see eq. (1.130). For an infinitesimal rotation,
we write the rotation matrix $R_{ij}$
as in eq. (1.152). This gives (taking into
account that $Q_{ij} = Q_{ji}$)

\[
Q_{ij} \to Q_{ij} - (\epsilon_{ilm} Q_{lj} + \epsilon_{jlm} Q_{li})\,\delta\theta_m\,.
\]

Then, taking again into account that,
for a static external field, $\partial_i E_{j,{\rm ext}} = \partial_j E_{i,{\rm ext}}$,
eq. (6.70) gives

\[
\delta (U_E)_{\rm quadr} = \frac{1}{3}\,\epsilon_{ilm}\, Q_{lj}\, \partial_j E_{i,{\rm ext}}\, \delta\theta_m\,.
\]

Therefore, the quadrupole contribution
to the torque is

\[
N_m = -\frac{1}{3}\,\epsilon_{ilm}\, Q_{lj}\, \partial_j E_{i,{\rm ext}}\,, \tag{6.73}
\]

which, after renaming the indices as
$m \to i$, $i \to k$, $l \to j$ and $j \to l$, gives
eq. (6.74).


position $\mathbf{x}$, in addition to this there will be a torque exerted by the force
$\mathbf{F}$ in eq. (6.64), that will make the dipole rotate around the new origin,
so the total torque is

\[
\mathbf{N} = \mathbf{d}\times\mathbf{E}_{\rm ext} + \mathbf{x}\times\left[(\mathbf{d}\cdot\nabla)\mathbf{E}_{\rm ext}\right]. \tag{6.69}
\]

The second term vanishes if the external electric field is uniform, while
the first is present even for a uniform field.

The next term in eq. (6.59) involves the electric quadrupole. Actually,
the reduced quadrupole defined in eq. (6.17) also involves a term
proportional to $\delta_{ij}$. However, $\mathbf{E}_{\rm ext}$ is defined as the electric field generated
by the external charges $\rho_{\rm ext}$, and therefore satisfies $\nabla\cdot\mathbf{E}_{\rm ext} = \rho_{\rm ext}/\epsilon_0$.
Since the external charges are localized in the volume $V'$, which has no
overlap with $V$, inside $V$ we have $\nabla\cdot\mathbf{E}_{\rm ext} = 0$; then $\delta_{ij}$, when contracted
with $\partial_i E_{j,{\rm ext}}$, gives zero. Therefore, we are free to add to the term $x^i x^j$
in eq. (6.59) the term proportional to $\delta_{ij}$ that completes the definition
(6.17) of the reduced quadrupole moment. Also taking into account the
factor of 3 in the relation between the reduced quadrupole moment $q_{ij}$
and the quadrupole moment $Q_{ij}$, see eq. (6.18), we see that the energy
associated with an electric quadrupole in an external electric field is


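\[
(U_E)_{\rm quadr}(\mathbf{x}) = -\frac{1}{6}\, Q_{ij}\, \partial_i E_{j,{\rm ext}}(\mathbf{x})\,. \tag{6.70}
\]

The force exerted on the quadrupole by the external electric field is
obtained from $F_k = -\partial_k (U_E)_{\rm quadr}$, so

\[
F_k = \frac{1}{6}\, Q_{ij}\, \partial_k \partial_i E_{j,{\rm ext}}\,. \tag{6.71}
\]

Again, using the fact that, for static fields, $\partial_k E_{j,{\rm ext}} = \partial_j E_{k,{\rm ext}}$, we can
rewrite this as

\[
\mathbf{F} = \frac{1}{6}\, Q_{ij}\, \partial_i \partial_j \mathbf{E}_{\rm ext}\,. \tag{6.72}
\]

Similarly, for the total torque on a quadrupole with respect to its origin,
we get¹³

\[
N_i = \frac{1}{3}\,\epsilon_{ijk}\, Q_{jl}\, \partial_l E_{k,{\rm ext}}\,. \tag{6.74}
\]

If we denote by $Q\cdot\nabla$ the vector differential operator whose $j$-th component
is $Q_{jl}\partial_l$, we can write this in a more compact form as

\[
\mathbf{N} = \frac{1}{3}\,(Q\cdot\nabla)\times\mathbf{E}_{\rm ext}\,. \tag{6.75}
\]

Again, in a frame where the origin used to define the quadrupole moment
is at a generic position $\mathbf{x}$, rather than at $\mathbf{x} = 0$, we must add to this the
term $\mathbf{x}\times\mathbf{F}$, where now $\mathbf{F}$ is given by eq. (6.72), and therefore

\[
\mathbf{N} = \frac{1}{3}\,(Q\cdot\nabla)\times\mathbf{E}_{\rm ext} + \frac{1}{6}\,\mathbf{x}\times\left(Q_{ij}\,\partial_i\partial_j \mathbf{E}_{\rm ext}\right). \tag{6.76}
\]


6.4.2 Interaction between the electric multipoles of
two charge distributions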


A description in terms of interaction of the multipole moments of a 
charge distribution with a given external electric field is particularly 
appropriate when the external field is generated by a macroscopic ob- 
ject, and we study its interaction with a microscopic charge distribution. 
If, instead, we have two localized charge distributions of similar size, 
e.g., two molecules, interacting among them, a symmetric treatment of 
the two systems can be more appropriate. In this case, we start from 
eq. (5.75), that we rewrite here 


\[
U_E^{\rm int} = \frac{1}{4\pi\epsilon_0}\int d^3x \int d^3x'\, \frac{\rho_1(\mathbf{x})\,\rho_2(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{6.77}
\]


We choose the origin $O$ of the reference frame at some point inside
the volume $V$, see Fig. 6.5; we will use this origin to define the multipole
moments of the charge distribution $\rho_1(\mathbf{x})$ (recall the discussion
of eqs. (6.21)-(6.25) on the dependence of multipole moments on the
choice of origin). We denote by $O'$ a fixed point inside the volume $V'$,
and we will use this origin to define the multipole moments of the charge
distribution $\rho_2(\mathbf{x})$. We denote the vector from $O$ to $O'$ by $\mathbf{r}$. A point
inside the volume $V$ can be labeled by a vector $\mathbf{x}$ starting from the origin
$O$. Similarly, a point inside $V'$ can be labeled by a vector $\mathbf{y}$ starting
from $O'$. With respect to the origin $O$, the coordinate $\mathbf{x}'$ of the latter
point is given by $\mathbf{x}' = \mathbf{r} + \mathbf{y}$. Then, eq. (6.77) can be rewritten as

\[
U_E^{\rm int}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int d^3x \int d^3x'\, \frac{\rho_1(\mathbf{x})\,\rho_2(\mathbf{x}')}{|\mathbf{r} - (\mathbf{x} - \mathbf{y})|}\,, \tag{6.78}
\]

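where $\mathbf{r}$ is fixed and $\mathbf{y}$ is a function of the integration variable $\mathbf{x}'$, given
by $\mathbf{y} = \mathbf{x}' - \mathbf{r}$. We write it in this form, however, because we can now
expand the denominator in the limit $|\mathbf{x}| \ll r$, $|\mathbf{y}| \ll r$ (where $r = |\mathbf{r}|$),
corresponding to the fact that we are interested in the limit in which
the linear sizes of the volumes $V$ and $V'$ are much smaller than $r$. We
then expand the denominator to second order, as in eq. (6.14),

\[
\frac{1}{|\mathbf{r} - (\mathbf{x}-\mathbf{y})|} = \frac{1}{r} + \frac{(x^i - y^i)\, r^i}{r^3} + \frac{1}{2r^5}\,(x^i - y^i)(x^j - y^j)\,(3 r_i r_j - \delta_{ij} r^2) + \ldots\,. \tag{6.79}
\]

Then, collecting the various terms,

\[
\begin{aligned}
(4\pi\epsilon_0)\, U_E^{\rm int}(\mathbf{r}) &= \frac{1}{r}\int d^3x\, \rho_1(\mathbf{x}) \int d^3x'\, \rho_2(\mathbf{x}') \\
&\quad + \frac{r_i}{r^3}\left[\int d^3x'\, \rho_2(\mathbf{x}')\int d^3x\, \rho_1(\mathbf{x})\, x^i - \int d^3x\, \rho_1(\mathbf{x})\int d^3x'\, \rho_2(\mathbf{x}')\, y^i\right] \\
&\quad + \frac{3 r_i r_j - \delta_{ij} r^2}{2r^5}\left[\int d^3x'\, \rho_2(\mathbf{x}')\int d^3x\, \rho_1(\mathbf{x})\, x^i x^j + \int d^3x\, \rho_1(\mathbf{x})\int d^3x'\, \rho_2(\mathbf{x}')\, y^i y^j \right. \\
&\qquad\quad \left. - \int d^3x\, \rho_1(\mathbf{x})\, x^i \int d^3x'\, \rho_2(\mathbf{x}')\, y^j - \int d^3x\, \rho_1(\mathbf{x})\, x^j \int d^3x'\, \rho_2(\mathbf{x}')\, y^i\right] + \ldots\,.
\end{aligned} \tag{6.80}
\]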


Fig. 6.5 Two charge distributions localized
on non-overlapping regions,
and the coordinates and origins described
in the text.

¹⁴ Equivalently, introducing the charge
distribution $\rho_2^{(0)}(\mathbf{x})$ as in eq. (5.77), we
have $\rho_2(\mathbf{x}') = \rho_2^{(0)}(\mathbf{y})$ and, since $d^3x' = d^3y$,
eq. (6.82) reads

\[
d_2^i = \int d^3y\, \rho_2^{(0)}(\mathbf{y})\, y^i\,. \tag{6.83}
\]


In the first line, we see that the integrals give the total charges $q_1$ and
$q_2$ of the two charge distributions. In the second line, we recognize the
electric dipole moment of the first charge distribution,

\[
d_1^i = \int_V d^3x\, \rho_1(\mathbf{x})\, x^i\,, \tag{6.81}
\]

defined with respect to the origin $O$, which is the natural choice for this
charge distribution. Similarly,

\[
\begin{aligned}
d_2^i &= \int_{V'} d^3x'\, \rho_2(\mathbf{x}')\, y^i \\
&= \int_{V'} d^3x'\, \rho_2(\mathbf{x}')\,(x' - r)^i\,,
\end{aligned} \tag{6.82}
\]

is the dipole moment of the second charge distribution, now defined with
respect to the origin $O'$, which is the natural definition for this charge
distribution.¹⁴ In the last line of eq. (6.80), the quantity

\[
\int_V d^3x\, \rho_1(\mathbf{x})\, x^i x^j \tag{6.84}
\]

is the reduced quadrupole moment of the charge distribution $\rho_1(\mathbf{x})$,
again with respect to the origin $O$, except that the term proportional to
$\delta_{ij}$ in eq. (6.17) is missing. However, this expression is contracted with
$3 r_i r_j - r^2\delta_{ij}$, and, using $\delta_{ii} = 3$,

\[
(3 r_i r_j - r^2\delta_{ij})\,\delta_{ij} = 0\,, \tag{6.85}
\]

so we can add for free the missing term proportional to $\delta_{ij}$ and reconstruct
the full expression for the reduced quadrupole moment $q_1^{ij}$. In the
same way,

\[
\int_{V'} d^3x'\, \rho_2(\mathbf{x}')\, y^i y^j \tag{6.86}
\]

is the reduced quadrupole moment of the charge distribution $\rho_2(\mathbf{x}')$, with
respect to its natural origin $O'$ (again, apart from a term proportional to
$\delta_{ij}$, which anyhow gives zero upon contraction). So, putting everything
together, and writing $q_{1,2}^{ij} = Q_{1,2}^{ij}/3$, we get

\[
\begin{aligned}
(4\pi\epsilon_0)\, U_E^{\rm int}(\mathbf{r}) &= \frac{q_1 q_2}{r} + \frac{1}{r^3}\,(q_2\, \mathbf{d}_1\cdot\mathbf{r} - q_1\, \mathbf{d}_2\cdot\mathbf{r}) \\
&\quad + \frac{3 r_i r_j - r^2\delta_{ij}}{2r^5}\left[\frac{1}{3}\left(q_1 Q_2^{ij} + q_2 Q_1^{ij}\right) - \left(d_1^i d_2^j + d_2^i d_1^j\right)\right] + \ldots\,.
\end{aligned} \tag{6.87}
\]


The first term, proportional to $q_1 q_2/r$, is a “monopole-monopole” term,
i.e., the Coulomb interaction between the total electric charges of the two
charge distributions. If both $q_1$ and $q_2$ are non-zero, it is the dominant
term at large $r$. The second term, proportional to $(q_2\,\mathbf{d}_1\cdot\mathbf{r} - q_1\,\mathbf{d}_2\cdot\mathbf{r})$, gives
the interaction between the charge of one distribution and the electric
dipole of the other, and is therefore a “monopole-dipole” term. Note
that, as all other terms, it is symmetric under the exchange of the two



charge distributions, $1 \leftrightarrow 2$ (observing that, under such an exchange,
$\mathbf{r} \to -\mathbf{r}$). In the second line we have a “monopole-quadrupole” term,
and a “dipole-dipole” term. As long as $q_1$ and $q_2$ are non-zero, at large
distances, where this expansion is valid, the monopole-monopole term,
which is of order $1/r$, dominates over the monopole-dipole term, which is
of order $1/r^2$, and this in turn dominates over the monopole-quadrupole
and dipole-dipole terms, which are of order $1/r^3$. However, if both
localized charge distributions have an overall zero charge, $q_1 = q_2 = 0$,
the dominant term becomes the dipole-dipole interaction,


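\[
(U_E)_{\rm dipole-dipole} = \frac{1}{4\pi\epsilon_0}\,\frac{r^2\delta_{ij} - 3 r_i r_j}{2r^5}\left(d_1^i d_2^j + d_2^i d_1^j\right). \tag{6.88}
\]

Performing the contraction of indices and using $\hat{\mathbf{r}} = \mathbf{r}/r$, this gives

\[
(U_E)_{\rm dipole-dipole} = \frac{1}{4\pi\epsilon_0}\,\frac{\mathbf{d}_1\cdot\mathbf{d}_2 - 3(\mathbf{d}_1\cdot\hat{\mathbf{r}})(\mathbf{d}_2\cdot\hat{\mathbf{r}})}{r^3}\,. \tag{6.89}
\]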


Observe that the interaction between dipoles can be attractive or repul- 
sive, depending on the relative orientation of the dipoles and on their 
relative direction with respect to the vector r joining them. For instance, 
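if the dipoles are orthogonal to $\mathbf{r}$, so that $\mathbf{d}_1\cdot\hat{\mathbf{r}} = \mathbf{d}_2\cdot\hat{\mathbf{r}} = 0$, the interaction
is repulsive when the dipoles are parallel and attractive when they
are antiparallel. If, instead, $\mathbf{d}_1$ and $\mathbf{d}_2$ are aligned or antialigned with $\hat{\mathbf{r}}$,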
the interaction is attractive when the dipoles are parallel and repulsive 
when they are antiparallel. 
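The four configurations just described can be verified directly from eq. (6.89). The following sketch is an illustration under stated assumptions: units with $4\pi\epsilon_0 = 1$, unit dipole moments, and a separation of length 2 along the $z$ axis; the function name is introduced only for this example. A positive energy that decreases with distance corresponds to repulsion, a negative one to attraction.

```python
import math

def dipole_dipole_energy(d1, d2, r_vec):
    """[d1.d2 - 3 (d1.r_hat)(d2.r_hat)] / r^3, in units where 4*pi*eps0 = 1 (r != 0)."""
    r = math.sqrt(sum(c * c for c in r_vec))
    r_hat = [c / r for c in r_vec]
    d1_dot_d2 = sum(a * b for a, b in zip(d1, d2))
    d1_r = sum(a * b for a, b in zip(d1, r_hat))
    d2_r = sum(a * b for a, b in zip(d2, r_hat))
    return (d1_dot_d2 - 3.0 * d1_r * d2_r) / r**3

r_vec = (0.0, 0.0, 2.0)                        # separation along z
x_hat, z_hat = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
minus = lambda v: tuple(-c for c in v)

print(dipole_dipole_energy(x_hat, x_hat, r_vec))         # orthogonal to r, parallel
print(dipole_dipole_energy(x_hat, minus(x_hat), r_vec))  # orthogonal to r, antiparallel
print(dipole_dipole_energy(z_hat, z_hat, r_vec))         # along r, parallel
print(dipole_dipole_energy(z_hat, minus(z_hat), r_vec))  # along r, antiparallel
```

With these assumed values the four energies should be 0.125, -0.125, -0.25 and 0.25, reproducing the pattern of repulsion and attraction stated in the text.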

Equation (6.89) could have also been obtained more simply, consid- 
ering the interaction between the electric dipole of the second charge 
distribution with the external electric field created by the first. Accord- 
ing to eq. (6.61), this is 


\[
(U_E)_{\rm dipole} = -\mathbf{d}_2\cdot\mathbf{E}_1\,. \tag{6.90}
\]

We then substitute the electric field generated by the electric dipole
of the first charge distribution at the position $O'$, that, according to
eq. (6.13), is given by

\[
\mathbf{E}_1 = \frac{1}{4\pi\epsilon_0}\,\frac{3(\mathbf{d}_1\cdot\hat{\mathbf{r}})\hat{\mathbf{r}} - \mathbf{d}_1}{r^3}\,, \tag{6.91}
\]

and we get again eq. (6.89). The expansion (6.87), however, provides
the cleanest way of understanding the structure of the expansion and
computing all terms systematically. These results have been obtained
from an expansion at large distances, compared to the size of the charge
distributions. For the interaction between two point-like electric dipoles,
we must also add the Dirac delta in eq. (6.53), and therefore

\[
(U_E)_{\rm dipole-dipole} = \frac{1}{4\pi\epsilon_0}\,\frac{\mathbf{d}_1\cdot\mathbf{d}_2 - 3(\mathbf{d}_1\cdot\hat{\mathbf{r}})(\mathbf{d}_2\cdot\hat{\mathbf{r}})}{r^3} + \frac{1}{3\epsilon_0}\,\mathbf{d}_1\cdot\mathbf{d}_2\,\delta^{(3)}(\mathbf{r})\,, \tag{6.92}
\]

where $\mathbf{r} = \mathbf{x}_2 - \mathbf{x}_1$ is the relative position of the two point-like
electric dipoles.

¹⁵ As discussed in Note 5 on
page 140, however, in magnetostatics the
magnetic dipole is independent of the choice
of the origin, and the dependence only
starts from the magnetic quadrupole,
which we will not include here.


6.4.3 Interactions of magnetic multipoles 


We next turn to the interactions involving the magnetic multipole mo- 
ments. To compute the mechanical forces, such as those due for instance 
to the interaction of a magnetic dipole with an external field, or between 
two magnetic dipoles, the most convenient quantity is the mechanical 
potential $U_B$ introduced in eq. (5.136), since from it we can obtain the
mechanical forces by taking spatial derivatives at fixed currents, as in
eq. (5.146), and therefore keeping the magnetic moments fixed. However,
one should keep in mind that $U_B = -\mathcal{E}_B$, see eq. (5.139), so the
corresponding formulas for the magnetic energy have the opposite sign.
In the following, we will work in terms of $U_B$ rather than $\mathcal{E}_B$.

We limit ourselves to the magnetic dipole term, since higher-order 
magnetic multipoles are rarely encountered in practical applications. We 
proceed similarly to what we did for the electric dipole. We now start 
from eq. (5.135), that we write in the form 


gist = — | de jx (x) Aa(x), (6.93) 
where, from eq. (4.92) 
_ Ho 37 _J2(x’) 
Ao(x) = = fa z kaa" (6.94) 


To stress that we consider the current distribution jg as an external 
source from the point of view of the current distribution j,, we change 
notation writing ji(x) = j(x), j2(x) = jext(x), A2(x) = Acxt(x) and 


we denote the interaction potential ue as (Up )ext, in analogy with the 
notation used in eq. (6.59) for the electrostatic case. Then 


Cias J Peba. (6.95) 


We assume that j(x) is localized in a finite volume V, and that Acxt(x) 
varies slowly across V. Similarly to what we did in eq. (6.59), we choose 
an origin inside V, that we use to define the multipole moments,!° and 
we expand Aext(X) around that origin. Then, to first order, eq. (6.95) 
becomes 


(Önden = —Aient(0) | dx Jil) — ApAiene(0) f Pojat 
(6.96) 
The first term vanishes because of eq. (6.31), while the second term is 
transformed using eq. (6.37). This gives 


$$ \begin{aligned}
(U_B)_{\rm ext} &= \epsilon_{ikl}\,\partial_k A_{i,\rm ext}(0)\, m_l + \ldots \\
&= -\epsilon_{ikl}\,\partial_i A_{k,\rm ext}(0)\, m_l + \ldots \\
&= -B_{l,\rm ext}(0)\, m_l + \ldots ,
\end{aligned} \tag{6.97} $$
and the first term in this expansion defines the interaction of a magnetic
dipole with the external magnetic field, $(U_B)_{\rm dipole}$. As in the electrostatic
case, we perform a rigid displacement of the current density, in
the fixed external field, so that the position x = 0 is replaced by a
generic position x, with respect to which we still define the multipoles.
Therefore, in vector form,


$$ (U_B)_{\rm dipole}(\mathbf{x}) = -\mathbf{m}\cdot\mathbf{B}_{\rm ext}(\mathbf{x}) . \tag{6.98} $$


Equation (6.98) can be compared to eq. (6.61) for the electric dipole. 
The computation of the force and torque on a magnetic dipole is then 
completely analogous to that performed for the electric dipole, taking 
into account that $U_B$ (at fixed currents, and therefore at fixed magnetic
moment) plays the role that $U_E$ plays in electrostatics (at fixed charges).
The force exerted on a magnetic dipole by an external magnetic field is
obtained from eq. (5.146) and is

$$ F_k = m_i\, \partial_k B_{i,\rm ext} , \tag{6.99} $$


to be compared to eq. (6.62). If the current $\mathbf{j}_{\rm ext}$ sourcing the external
magnetic field has no overlap with the current distribution $\mathbf{j}$ that gives
rise to the magnetic dipole $\mathbf{m}$ on which we are computing the force,
then, from $\nabla\times\mathbf{B}_{\rm ext} = \mu_0\,\mathbf{j}_{\rm ext}$ it follows that $\nabla\times\mathbf{B}_{\rm ext} = 0$ in the region
where we compute the force, and eq. (6.99) can also be written as
$F_k = m_i\,\partial_i B_{k,\rm ext}$ or, in vector notation,

$$ \mathbf{F} = (\mathbf{m}\cdot\nabla)\,\mathbf{B}_{\rm ext} , \tag{6.100} $$


to be compared to eq. (6.64). The torque acting on a magnetic dipole 
at the origin, due to an external magnetic field, is obtained exactly as 
in the derivation of eq. (6.68), and is 


$$ \mathbf{N} = \mathbf{m}\times\mathbf{B}_{\rm ext} , \tag{6.101} $$


and tends to orient the magnetic dipole so that it aligns with the external
magnetic field, thereby minimizing $(U_B)_{\rm dipole}$. For a magnetic
dipole located at a generic point $\mathbf{x}$, we must add to this torque the term
$\mathbf{x}\times\mathbf{F}$, where $\mathbf{F}$ is given by eq. (6.99) [which, when $\nabla\times\mathbf{B}_{\rm ext} = 0$ in the
region under consideration, can also be written as in eq. (6.100)], to be
compared to eq. (6.69).
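The relations (6.98), (6.99) and (6.101) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the field `sample_field`, its strength `b0`, and the helper names are illustrative assumptions, and the force is obtained from a central finite difference of $(U_B)_{\rm dipole}$ at fixed $\mathbf{m}$.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def dipole_potential(m, B_ext, x):
    """U_B(x) = -m . B_ext(x), eq. (6.98)."""
    return -dot(m, B_ext(x))

def dipole_torque(m, B_ext, x):
    """N = m x B_ext, eq. (6.101): torque about the position of the dipole."""
    return cross(m, B_ext(x))

def dipole_force(m, B_ext, x, h=1e-6):
    """F_k = -d_k U_B at fixed m (equivalent to eq. (6.99)), by central differences."""
    force = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        dU = dipole_potential(m, B_ext, xp) - dipole_potential(m, B_ext, xm)
        force.append(-dU / (2 * h))
    return tuple(force)

def sample_field(x, b0=0.5):
    """Hypothetical test field B = b0 (x, y, -2z): divergence-free and curl-free."""
    return (b0 * x[0], b0 * x[1], -2.0 * b0 * x[2])

m = (0.0, 0.0, 2.0)          # magnetic moment along z (arbitrary units)
x0 = (0.1, -0.2, 0.3)        # position of the dipole
F = dipole_force(m, sample_field, x0)
N = dipole_torque(m, sample_field, x0)
```

Since the sample field is curl-free, the force obtained here also agrees with $(\mathbf{m}\cdot\nabla)\mathbf{B}_{\rm ext}$ of eq. (6.100), which for this field is $(0, 0, -2 b_0 m_z)$.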

Finally, we consider the interaction between the magnetic multipoles 
of two different current densities, similarly to the discussion in Sec- 
tion 6.4.2 for the electric case. However, in this case the “monopole” 
term is absent, and we are not interested in going beyond the magnetic 
dipole, since higher-order magnetic multipoles rarely appear. There- 
fore, the only interaction term that we need is the dipole-dipole term. 
Rather than performing the full expansion of $U_B$ similarly to what we
have done for $U_E$ in eqs. (6.80)-(6.87), it is then simpler to proceed as
in eqs. (6.90) and (6.91): we write

$$ (U_B)_{\rm dipole} = -\mathbf{m}_2\cdot\mathbf{B}_1 , \tag{6.102} $$


for the interaction potential between the magnetic dipole $\mathbf{m}_2$ of the
second current distribution and the magnetic field $\mathbf{B}_1$ generated by the
magnetic moment of the first current distribution, and, for the latter,
we use eq. (6.40),

$$ \mathbf{B}_1 = \frac{\mu_0}{4\pi}\,\frac{3(\mathbf{m}_1\cdot\hat{\mathbf{n}})\,\hat{\mathbf{n}} - \mathbf{m}_1}{r^3} . \tag{6.103} $$

16 The subscript a in $\gamma_a$ is unconventional, but helps to avoid confusion
with the Lorentz $\gamma$ factor, that will instead appear in the formula for the
frequency at which the position of a charged particle rotates in a magnetic
field, see eq. (8.201) in Solved Problem 8.4.

We then obtain

$$ (U_B)_{\rm dipole-dipole} = \frac{\mu_0}{4\pi}\,\frac{\mathbf{m}_1\cdot\mathbf{m}_2 - 3(\mathbf{m}_1\cdot\hat{\mathbf{n}})(\mathbf{m}_2\cdot\hat{\mathbf{n}})}{r^3} , \tag{6.104} $$


to be compared to eq. (6.89). Once again, for the interaction between 
two point-like magnetic dipoles, we must also add the Dirac delta in 
eq. (6.58), so that 


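$$ (U_B)_{\rm dipole-dipole} = \frac{\mu_0}{4\pi}\,\frac{\mathbf{m}_1\cdot\mathbf{m}_2 - 3(\mathbf{m}_1\cdot\hat{\mathbf{n}})(\mathbf{m}_2\cdot\hat{\mathbf{n}})}{r^3} - \frac{2\mu_0}{3}\,\mathbf{m}_1\cdot\mathbf{m}_2\,\delta^{(3)}(\mathbf{r}) , \tag{6.105} $$
where $\mathbf{r} = \mathbf{x}_2 - \mathbf{x}_1$ is the relative distance between the two point-like
magnetic dipoles, to be compared to eq. (6.92).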
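As an illustration of eq. (6.104), the following sketch evaluates the dipole-dipole potential for two simple configurations. It is only a sketch under stated assumptions: the function name is hypothetical, $\hat{\mathbf{n}}$ is taken as the unit vector along $\mathbf{r}$, the value of `MU0` is the approximate SI value, and the contact term of eq. (6.105) is deliberately left out, so the function is only meaningful for $r \neq 0$.

```python
import math

MU0 = 4e-7 * math.pi  # approximate SI value of mu_0, used for illustration only

def dipole_dipole_potential(m1, m2, r_vec):
    """(U_B)_dipole-dipole of eq. (6.104), valid for r != 0 (no delta term)."""
    r = math.sqrt(sum(c * c for c in r_vec))
    if r == 0.0:
        raise ValueError("eq. (6.104) holds only for r != 0")
    n = [c / r for c in r_vec]                      # unit vector along r
    m1_m2 = sum(a * b for a, b in zip(m1, m2))
    m1_n = sum(a * b for a, b in zip(m1, n))
    m2_n = sum(a * b for a, b in zip(m2, n))
    return MU0 / (4 * math.pi) * (m1_m2 - 3 * m1_n * m2_n) / r**3

# Two unit dipoles along z, at unit separation.
head_to_tail = dipole_dipole_potential((0, 0, 1), (0, 0, 1), (0, 0, 1))
side_by_side = dipole_dipole_potential((0, 0, 1), (0, 0, 1), (1, 0, 0))
```

With parallel moments, the head-to-tail configuration has negative potential (attractive), while the side-by-side configuration has positive potential (repulsive), as expected from the angular structure of eq. (6.104).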


6.5 Solved problems 


Problem 6.1. Larmor precession 


The torque gives the rate of change of the angular momentum L, as 


$$ \mathbf{N} = \frac{d\mathbf{L}}{dt} . \tag{6.106} $$
As an application, recall from eq. (6.43) that, for a non-relativistic particle
with charge $q_a$, mass $m_a$ and angular momentum $\mathbf{L}$, the magnetic dipole
moment is $\mathbf{m} = (q_a/2m_a)\,\mathbf{L}$. We write, more generally,
$$ \mathbf{m} = \gamma_a \mathbf{L} , \tag{6.107} $$
(so that this equation applies also to the spin angular momentum, for which
the proportionality constant is different, see Note 8 on page 141).¹⁶ Equations
(6.101) and (6.106) then give
$$ \frac{d\mathbf{L}}{dt} = \gamma_a\, \mathbf{L}\times\mathbf{B}_{\rm ext} , \tag{6.108} $$
which gives the evolution of the angular momentum (or, equivalently, of the
magnetic moment) in an external magnetic field. Performing the scalar product
of both sides of eq. (6.108) with $\mathbf{L}$ we get
$$ \mathbf{L}\cdot\frac{d\mathbf{L}}{dt} = 0 , \tag{6.109} $$
since $\mathbf{L}\cdot(\mathbf{L}\times\mathbf{B}_{\rm ext}) = 0$. This can be rewritten as
$$ \frac{1}{2}\,\frac{d(\mathbf{L}\cdot\mathbf{L})}{dt} = 0 , \tag{6.110} $$


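and therefore the modulus $|\mathbf{L}|$ is constant. Similarly, multiplying eq. (6.108)
by $\mathbf{B}_{\rm ext}$ and using $\mathbf{B}_{\rm ext}\cdot(\mathbf{L}\times\mathbf{B}_{\rm ext}) = 0$, we get $\mathbf{B}_{\rm ext}\cdot d\mathbf{L}/dt = 0$. Therefore, also
the component of $\mathbf{L}$ parallel to $\mathbf{B}$ is conserved, and only the components of $\mathbf{L}$
orthogonal to $\mathbf{B}$ change. Setting $\mathbf{B}_{\rm ext} = B\hat{\mathbf{z}}$, eq. (6.108) gives

$$ \frac{dL_x}{dt} = -\omega_L L_y , \tag{6.111} $$
$$ \frac{dL_y}{dt} = \omega_L L_x , \tag{6.112} $$
$$ \frac{dL_z}{dt} = 0 , \tag{6.113} $$
where
$$ \omega_L = -\gamma_a B \tag{6.114} $$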


is called the Larmor frequency. When $\gamma_a = q_a/2m_a$, as for a classical particle
of charge $q_a$ and mass $m_a$,
$$ \omega_L = -\frac{q_a B}{2 m_a} . \tag{6.115} $$
The minus sign in the definition of $\omega_L$ is inserted so that, for electrons, where
$q_a = -e < 0$, we have $\omega_L > 0$. The solution of eqs. (6.111) and (6.112) is
$$ L_x = L_\perp \cos(\omega_L t + \varphi) , \tag{6.116} $$
$$ L_y = L_\perp \sin(\omega_L t + \varphi) , \tag{6.117} $$


where $L_\perp = (L_x^2 + L_y^2)^{1/2}$ is the constant value of the projection of $\mathbf{L}$ on the
(x,y) plane, and $\varphi$ is a phase (that can be reabsorbed into a choice of the
origin for t). So, the vector $\mathbf{L}_\perp = L_x\hat{\mathbf{x}} + L_y\hat{\mathbf{y}}$ rotates in the (x,y) plane at
the Larmor frequency (counterclockwise, if $\omega_L > 0$), while $L_z$ stays constant.
This behavior is known as the Larmor precession. The factor $\gamma_a$, when Larmor
precession is applied to the spin of an elementary particle with mass $m_a$ and
charge $q_a$, must be written as

$$ \gamma_a = \frac{g_a q_a}{2 m_a} , \tag{6.118} $$

where $g_a$ is the g-factor of the particle (see again Note 8 on page 141). The
constant $\gamma_a$ is called the gyromagnetic ratio of the particle.¹⁷

17 Note that the name “gyromagnetic ratio” is also often used instead for the
dimensionless g-factor.
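The Larmor precession of Problem 6.1 can be reproduced by integrating eq. (6.108) numerically and comparing with the analytic solution (6.116)-(6.117). The sketch below uses a standard fourth-order Runge-Kutta step; the units, the values of `gamma_a` and `B`, the initial condition and the step size are all illustrative assumptions.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def larmor_rhs(L, gamma_a, B):
    """Right-hand side of eq. (6.108): dL/dt = gamma_a L x B_ext."""
    return tuple(gamma_a * c for c in cross(L, B))

def rk4_step(L, dt, gamma_a, B):
    """One fourth-order Runge-Kutta step for eq. (6.108)."""
    def shifted(u, k, s):
        return tuple(ui + s * ki for ui, ki in zip(u, k))
    k1 = larmor_rhs(L, gamma_a, B)
    k2 = larmor_rhs(shifted(L, k1, dt / 2), gamma_a, B)
    k3 = larmor_rhs(shifted(L, k2, dt / 2), gamma_a, B)
    k4 = larmor_rhs(shifted(L, k3, dt), gamma_a, B)
    return tuple(Li + dt / 6 * (a + 2 * b + 2 * c + d)
                 for Li, a, b, c, d in zip(L, k1, k2, k3, k4))

gamma_a = -1.0                 # negative, as for an electron (arbitrary units)
B = (0.0, 0.0, 2.0)            # B_ext = B z-hat
omega_L = -gamma_a * B[2]      # eq. (6.114)

L = (1.0, 0.0, 0.5)            # L_perp = 1, phase phi = 0, L_z = 0.5
dt, steps = 1e-3, 1000
for _ in range(steps):
    L = rk4_step(L, dt, gamma_a, B)

t_final = dt * steps
L_exact = (math.cos(omega_L * t_final), math.sin(omega_L * t_final), 0.5)
L_norm = math.sqrt(sum(c * c for c in L))
```

Within the integration error, `L` agrees with `L_exact`, and both `L_norm` and the z component keep their initial values, in agreement with eqs. (6.110) and (6.113).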


Special Relativity 


We now introduce the basic postulates and the formalism of Special Rel- 
ativity. Special Relativity is one of the pillars of modern physics. Our 
presentation will be tuned toward understanding how Special Relativ- 
ity is hidden into Maxwell’s equations, and in building a “covariant” 
formalism that makes this symmetry explicit. 


7.1 The postulates 


To introduce the postulates of Special Relativity, we first need to define 
inertial frames. These are reference frames defined by the condition 
that, in these frames, a body on which no external force acts moves 
with constant velocity v (constant both in modulus and in direction).
The special theory of relativity, as formulated by Einstein in 1905, is 
based on two postulates: 


(1) Principle of relativity: the laws of nature are the same in all inertial 
frames. 


(2) Constancy of the speed of light: the speed of light has the same 
value in all inertial frames. 


Newtonian physics also has a relativity principle, that we now call 
Galilean Relativity, that again states that the laws of Newtonian physics 
are the same in all coordinate systems moving at uniform speed relative 
to one another.¹ Given two reference frames K, with coordinates (t, x),


1 From Galileo’s 1632 book Dialogue Concerning the Two Chief World Systems (Second
Day), translated by S. Drake, University of California Press, 1953 (taken from
https://en.wikipedia.org/wiki/Galileo%27s_ship): “Shut yourself up with some
friend in the main cabin below decks on some large ship, and have with you there 
some flies, butterflies, and other small flying animals. Have a large bowl of water with 
some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath 
it. With the ship standing still, observe carefully how the little animals fly with equal 
speed to all sides of the cabin. The fish swim indifferently in all directions; the drops 
fall into the vessel beneath; and, in throwing something to your friend, you need 
throw it no more strongly in one direction than another, the distances being equal; 
jumping with your feet together, you pass equal spaces in every direction. When you 
have observed all these things carefully (though doubtless when the ship is standing 
still everything must happen in this way), have the ship proceed with any speed you 
like, so long as the motion is uniform and not fluctuating this way and that. You 
will discover not the least change in all the effects named, nor could you tell from 
any of them whether the ship was moving or standing still. In jumping, you will pass 
on the floor the same spaces as before, nor will you make larger jumps toward the 
stern than toward the prow even though the ship is moving quite rapidly, despite the 
2 Of course, different observers can use
different origins for time, so that, in
general $t' = t + t_0$. This reflects another
invariance of Newtonian mechanics, in- 
variance under time translations, which 
is related to energy conservation. When 
one says that, in Newtonian mechan- 
ics, time is absolute, one means that 
time differences are the same for all ob- 
servers in relative uniform motion. 


and K’, with coordinates (t’, x’), such that the origin of K’ moves with
respect to the origin of K with velocity $-\mathbf{v}_0$, in Galilean Relativity the
space and time coordinates of the two frames are related by
$$ t' = t , \qquad \mathbf{x}' = \mathbf{x} + \mathbf{v}_0 t \tag{7.1} $$
(with a suitable choice for the origin in space and time, such that t = 0
corresponds to t’ = 0 and, at t = 0, the point x = 0 corresponds to
x’ = 0). The laws of Newtonian mechanics are invariant under these
transformations. Note that, in Newtonian mechanics, time is absolute,
and is the same in all reference frames, i.e., for all observers.² In Galilean
Relativity, from eq. (7.1), if a particle in the frame K moves along the
trajectory $\mathbf{x}(t)$, so that its velocity is $\mathbf{v}(t) = d\mathbf{x}(t)/dt$, in the frame K’
it will move on the trajectory $\mathbf{x}'(t) = \mathbf{x}(t) + \mathbf{v}_0 t$ and will therefore have
a velocity $\mathbf{v}'(t) = d\mathbf{x}'(t)/dt$, such that
$$ \mathbf{v}'(t) = \mathbf{v}(t) + \mathbf{v}_0 . \tag{7.2} $$
Thus, according to Galilean Relativity, if in the frame K the velocity of
a light beam traveling along the x axis is $\mathbf{v} = c\,\hat{\mathbf{x}}$, and the frame K’
is related to K by a velocity transformation (7.1) along the x axis, i.e.,
$\mathbf{v}_0 = v_0\hat{\mathbf{x}}$, then in the frame K’ the light beam should travel at the velocity
$(c + v_0)\,\hat{\mathbf{x}}$. Thus, the second postulate of Special Relativity marks the
difference with Galilean Relativity and, as we will see more formally in 
the following, implies the end of the absolute notion of time. 

We now develop the mathematical consequences of the two postulates 
of Special Relativity. Consider two inertial frames: K, with coordinates 
x = (x, y, z), and K’, with coordinates x’ = (x’, y’, z’). We denote by t
the time measured in the K frame and by t’ that in the K’ frame. We do
not assume a priori t’ = t. The correct relation will emerge from the two
basic postulates. Suppose that, in the frame K, a flash of light is emitted
at time $t_1$ at the position $(x_1, y_1, z_1)$ and is subsequently absorbed at time
$t_2$ at the position $(x_2, y_2, z_2)$. The fact that light moves at the speed c


fact that during the time that you are in the air the floor under you will be going in 
a direction opposite to your jump. In throwing something to your companion, you 
will need no more force to get it to him whether he is in the direction of the bow or 
the stern, with yourself situated opposite. The droplets will fall as before into the 
vessel beneath without dropping toward the stern, although while the drops are in 
the air the ship runs many spans. The fish in their water will swim toward the front 
of their bowl with no more effort than toward the back, and will go with equal ease 
to bait placed anywhere around the edges of the bowl. Finally the butterflies and 
flies will continue their flights indifferently toward every side, nor will it ever happen 
that they are concentrated toward the stern, as if tired out from keeping up with the 
course of the ship, from which they will have been separated during long intervals by 
keeping themselves in the air. And if smoke is made by burning some incense, it will 
be seen going up in the form of a little cloud, remaining still and moving no more 
toward one side than the other. The cause of all these correspondences of effects is 
the fact that the ship’s motion is common to all the things contained in it, and to the 
air also. That is why I said you should be below decks; for if this took place above in 
the open air, which would not follow the course of the ship, more or less noticeable 
differences would be seen in some of the effects noted.” 


implies that 
$$ (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 - c^2 (t_1 - t_2)^2 = 0 . \tag{7.3} $$

In the frame K’ light will be emitted at time $t'_1$ at the position $(x'_1, y'_1, z'_1)$
and absorbed at time $t'_2$ at the position $(x'_2, y'_2, z'_2)$. Since, according to
the second postulate, also in K’ light propagates with the speed c, we
have

$$ (x'_1 - x'_2)^2 + (y'_1 - y'_2)^2 + (z'_1 - z'_2)^2 - c^2 (t'_1 - t'_2)^2 = 0 . \tag{7.4} $$

We define the interval $s^2$ between the two events as

$$ s^2 = -c^2 (t_1 - t_2)^2 + (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 . \tag{7.5} $$

We have therefore found that the interval between two events related by
light propagation is zero, in all inertial frames. Note that the interval
between two arbitrary events in general will not be zero: for example,
for events along the path of a particle moving in a straight line at a speed
v < c we have

$$ (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 = v^2 (t_1 - t_2)^2 , \tag{7.6} $$
and therefore
$$ \begin{aligned}
s^2 &= (x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 - c^2 (t_1 - t_2)^2 \\
&= (v^2 - c^2)(t_1 - t_2)^2 < 0 .
\end{aligned} \tag{7.7} $$


We can distinguish three cases: 


(1) Light-like interval: $s^2 = 0$, as for the flash of light discussed above.


(2) Time-like interval: $s^2 < 0$, as in eq. (7.7). Such events correspond
to the motion of particles traveling at v < c. This is, in particular,
the case for two events happening at the same point in space, at
successive values of time, i.e., at spatial separation $\Delta\mathbf{x} = 0$, and
$t_1 \neq t_2$.

(3) Space-like interval: $s^2 > 0$. This is, for instance, the case of two
events such that $t_1 = t_2$ but $\mathbf{x}_1 \neq \mathbf{x}_2$. Such events cannot be joined
by the trajectory of a particle moving with speed v < c. We say 
that they are causally disconnected, because, as we will discuss 
in Section 7.2.2, the first event cannot influence the second event, 
and vice versa. 
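The three cases can be summarized in a few lines of code. This is only an illustrative sketch: the function names, the representation of an event as a tuple (t, x, y, z), the choice of units with c = 1, and the numerical tolerance `tol` are assumptions made here, not part of the text.

```python
def interval_squared(event1, event2, c=1.0):
    """Interval s^2 of eq. (7.5) between two events given as (t, x, y, z)."""
    t1, x1, y1, z1 = event1
    t2, x2, y2, z2 = event2
    return (-c**2 * (t1 - t2) ** 2
            + (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)

def classify_interval(event1, event2, c=1.0, tol=1e-12):
    """Return 'light-like', 'time-like' or 'space-like' according to the sign of s^2."""
    s2 = interval_squared(event1, event2, c)
    if abs(s2) <= tol:
        return "light-like"
    return "time-like" if s2 < 0 else "space-like"

origin = (0.0, 0.0, 0.0, 0.0)
light_ray = classify_interval(origin, (1.0, 1.0, 0.0, 0.0))   # flash of light
same_place = classify_interval(origin, (1.0, 0.0, 0.0, 0.0))  # same point, later time
same_time = classify_interval(origin, (0.0, 1.0, 0.0, 0.0))   # same time, different point
```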


The relation between the space-time coordinates (t,x) of K and the 
space-time coordinates (t’,x’) of K’ must therefore be such that, when 
the interval between two events is zero in K, it must also be zero in 
K'. We now show that, in fact, this relation must be such that, even 
for non-zero intervals, the interval must be the same in the two frames.³

3 We follow here Section 2 of the old
classic Landau and Lifschits (1975).

4 As we will see below, in the limit
of velocities small with respect to the
speed of light we recover the composition
of velocities of Galilean Relativity,
$\mathbf{v}_{13} = \mathbf{v}_{12} + \mathbf{v}_{23}$. However, we do not
need to use this relation (that, as we
will see, is not valid for generic velocities),
but only the fact that $v_{13}$ depends
on the angle between the vectors $\mathbf{v}_{12}$
and $\mathbf{v}_{23}$, independently of the specific
form of this dependence.


To this purpose, it is convenient to work with infinitesimal intervals. In 
the K frame, the interval between an event at (t, x, y, z) and an event
at (t + dt, x + dx, y + dy, z + dz) is

$$ ds^2 = -c^2 dt^2 + dx^2 + dy^2 + dz^2 . \tag{7.8} $$

In the frame K’, the two events will have coordinates (t’, x’, y’, z’) and
(t’ + dt’, x’ + dx’, y’ + dy’, z’ + dz’), and the interval between them is
$ds'^2 = -c^2 dt'^2 + dx'^2 + dy'^2 + dz'^2$. Since $ds^2$ and $ds'^2$ are infinitesimals
of the same order, we must have

$$ ds^2 = a\, ds'^2 , \tag{7.9} $$


for some coefficient a. Because of the invariance under spatial and tem- 
poral translations (i.e., of the fact that there is no privileged position 
in space nor a privileged origin of time) the coefficient a cannot depend 
on the coordinates (t, x, y, z) of the first event that enters in $ds^2$ (nor on the
coordinates of the second event, that, furthermore, only differ infinites- 
imally from the first), and therefore can only depend on the relative 
velocity v between the two frames K and K’. Furthermore, because of 
the invariance under rotations (i.e., the isotropy of space) it can actually
depend only on the modulus $v = |\mathbf{v}|$. Consider now three reference
frames $K_1$, $K_2$, $K_3$ and denote by $\mathbf{v}_{12}$ the relative velocity of $K_2$ with
respect to $K_1$, by $\mathbf{v}_{13}$ the relative velocity of $K_3$ with respect to $K_1$,
and by $\mathbf{v}_{23}$ the relative velocity of $K_3$ with respect to $K_2$. Similarly, we
denote by $ds_1^2$, $ds_2^2$ and $ds_3^2$ the respective intervals. From eq. (7.9) we
have

$$ ds_1^2 = a(v_{12})\, ds_2^2 , \qquad ds_1^2 = a(v_{13})\, ds_3^2 , \qquad ds_2^2 = a(v_{23})\, ds_3^2 . \tag{7.10} $$
Combining these expressions we get
$$ a(v_{13}) = a(v_{12})\, a(v_{23}) . \tag{7.11} $$

However, $v_{13} = |\mathbf{v}_{13}|$ depends not only on $v_{12} = |\mathbf{v}_{12}|$ and on $v_{23} = |\mathbf{v}_{23}|$,
but also on the angle between the vectors $\mathbf{v}_{12}$ and $\mathbf{v}_{23}$. This angle does
not appear on the right-hand side of eq. (7.11) and therefore the only
possible solution of eq. (7.11) is that a does not depend on the velocity
at all and is just a constant. Then, eq. (7.11) reduces to $a^2 = a$, which
has the solutions a = 0, 1. The solution a = 0 is clearly not acceptable,
so we get a = 1. Thus, the relation between the coordinates (t,x) of
K and the coordinates (t’,x’) of K’ must be such that, for all events
(light-like, space-like, or time-like),

$$ ds^2 = ds'^2 . \tag{7.12} $$


From the equality of the infinitesimal intervals also follows the equality 
of the finite intervals, so $s^2 = s'^2$. In conclusion, from the two postulates
it follows that the laws of Nature must be invariant under the transfor- 
mations that leave invariant the interval (7.5) between two events. 
7.2 Space and time in Special Relativity


7.2.1 Lorentz transformations 


We now identify the set of transformations that leave invariant the interval
(7.5), which, without loss of generality, we can write as
$$ s^2 = -c^2 t^2 + x^2 + y^2 + z^2 , \tag{7.13} $$
having set $t_2 = 0$ and $\mathbf{x}_2 = 0$. The set of transformations that leaves this
expression invariant forms a group. As we saw in Section 1.7, the group
that leaves the quadratic form $x^2 + y^2 + z^2$ invariant is the rotation
group in three dimensions, SO(3) (apart from the parity transformations).
Similarly, the group of transformations that leaves invariant the
quadratic form (7.13) is called the Lorentz group,⁵ and the corresponding
transformations are called Lorentz transformations. First of all, we
see that rotations of the spatial coordinates, i.e., transformations of the
form
$$ t \to t' = t , \qquad x_i \to x'_i = R_{ij}\, x_j , \tag{7.14} $$


where R is the rotation matrix that we introduced in Section 1.6, leave 
the interval (7.13) invariant, since they do not touch time and they transform
the spatial coordinates in such a way that $x^2 + y^2 + z^2$ is invariant.
In three dimensions, the most general rotation can be expressed as a
combination of a rotation around the z axis, i.e., in the (x,y) plane,
a rotation around the x axis (i.e., in the (y,z) plane) and a rotation
around the y axis, so in the (x, z) plane. For instance, a rotation around
the z axis has the form

$$ x \to x' = x\cos\theta - y\sin\theta , \tag{7.15} $$
$$ y \to y' = x\sin\theta + y\cos\theta . \tag{7.16} $$


Since rotations form a group, they are a subgroup of the Lorentz group. 
It is convenient to introduce $x^0 = ct$, which has dimensions of length, just
as the $x^i$, and to define the four-vector $x^\mu$, with components $(x^0, x, y, z)$
(or, for uniformity of notation, $(x^0, x^1, x^2, x^3)$, so the “Lorentz index” $\mu$
takes the values {0,1,2,3}). Just as we did for vectors, we will actually
define four-vectors in terms of their transformations under the action of
the Lorentz group. Let us begin by observing that, under a rotation in
the (x,y) plane, $x^\mu \to x'^\mu = (x'^0, x', y', z')$, where

$$ \begin{pmatrix} x'^0 \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x^0 \\ x \\ y \\ z \end{pmatrix} . \tag{7.17} $$


We can similarly write all other rotations so, denoting by $\Lambda$ the 4 x 4
matrix of Lorentz transformations, and by R a generic 3 x 3 matrix
describing a rotation, rotations are a special case of Lorentz transformations,
of the form

$$ \Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & & & \\ 0 & & R & \\ 0 & & & \end{pmatrix} . \tag{7.18} $$

5 We will further refine the definition of
the Lorentz group later, by eliminating
the discrete parity transformations.

6 Typically, when there will be no possibility
of confusion, we will denote $\gamma(v)$
simply by $\gamma$.


By analogy with rotations in a plane, it is also easy to find other trans- 
formations that leave the interval (7.13) invariant. We can consider for 
instance a transformation that does not act on y and z, and that leaves 
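$(x^0)^2 - x^2$ invariant. This has the form of a “hyperbolic rotation”

$$ \begin{aligned}
x^0 \to x'^0 &= x^0\cosh\zeta + x\sinh\zeta , \\
x \to x' &= x^0\sinh\zeta + x\cosh\zeta ,
\end{aligned} \tag{7.19} $$

where $\zeta$ ranges in the interval $-\infty < \zeta < +\infty$ and (especially in a
particle physics context) is called the rapidity. In matrix form

$$ \begin{pmatrix} x'^0 \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \cosh\zeta & \sinh\zeta & 0 & 0 \\ \sinh\zeta & \cosh\zeta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x^0 \\ x \\ y \\ z \end{pmatrix} . \tag{7.20} $$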


A transformation of the form (7.19) is called a Lorentz boost along the 
x axis. We can similarly perform a hyperbolic rotation in the (t, y) and 
in the (t,z) planes. Thus, we have found six independent transformations
that leave the quadratic form (7.13) invariant, corresponding to
three rotations and three hyperbolic rotations. We will see below that
this exhausts the set of (proper) Lorentz transformations. First, let us
understand the physical meaning of eq. (7.19). We introduce $v_0$ from

$$ v_0 = c\tanh\zeta . \tag{7.21} $$


Since $-\infty < \zeta < +\infty$, we have $-c < v_0 < c$. Then eq. (7.19) can be
rewritten as

$$ x^0 \to x'^0 = \gamma(v_0)\left( x^0 + \frac{v_0}{c}\, x \right) , \tag{7.22} $$
$$ x \to x' = \gamma(v_0)\left( x + \frac{v_0}{c}\, x^0 \right) , \tag{7.23} $$

or, using $t = x^0/c$ instead of $x^0$,

$$ t \to t' = \gamma(v_0)\left( t + \frac{v_0}{c^2}\, x \right) , \tag{7.24} $$
$$ x \to x' = \gamma(v_0)\left( x + v_0 t \right) , \tag{7.25} $$
where we have introduced the “gamma factor”⁶
$$ \gamma(v_0) = \frac{1}{\sqrt{1 - (v_0/c)^2}} . \tag{7.26} $$

We see that, in the limit $v_0/c \to 0$, eqs. (7.24) and (7.25) reduce to
the transformation (7.1) of Galilean Relativity! Thus, from the point of
view of Special Relativity, the apparent validity of Galilean Relativity in
Newtonian physics, and in everyday experience, is due to the fact that
we usually deal with velocities which are very small compared to the
speed of light.
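The equivalence between the rapidity form (7.19) and the velocity form (7.24)-(7.25) of the boost, together with the invariance of the interval (7.13), can be verified numerically. The following is a minimal sketch restricted to the (t, x) plane; the function names, the units with c = 1 and the sample values are assumptions chosen for illustration.

```python
import math

def gamma(v0, c=1.0):
    """Gamma factor of eq. (7.26)."""
    return 1.0 / math.sqrt(1.0 - (v0 / c) ** 2)

def boost_x(t, x, v0, c=1.0):
    """Boost along x in the form of eqs. (7.24)-(7.25)."""
    g = gamma(v0, c)
    return g * (t + v0 * x / c**2), g * (x + v0 * t)

def boost_x_rapidity(t, x, zeta, c=1.0):
    """The same boost written as the hyperbolic rotation of eq. (7.19)."""
    x0 = c * t
    x0_new = x0 * math.cosh(zeta) + x * math.sinh(zeta)
    x_new = x0 * math.sinh(zeta) + x * math.cosh(zeta)
    return x0_new / c, x_new

def interval(t, x, c=1.0):
    """Interval (7.13) restricted to y = z = 0."""
    return -(c * t) ** 2 + x ** 2

t, x, zeta = 2.0, 0.7, 0.9
v0 = math.tanh(zeta)                  # eq. (7.21) with c = 1
t_v, x_v = boost_x(t, x, v0)
t_z, x_z = boost_x_rapidity(t, x, zeta)

# Galilean limit: for v0 << c the boost approaches t' = t, x' = x + v0 t.
t_slow, x_slow = boost_x(t, x, 1e-6)
```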

From eqs. (7.24) and (7.25) we can obtain the corresponding compo- 
sition of velocities. Consider a particle that, with respect to an observer 
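that uses coordinates (t, x, y, z), moves with velocity $\mathbf{v} = (v_x, v_y, v_z)$.
In this frame, in a time dt its coordinates will change by an amount
$d\mathbf{x} = \mathbf{v}\,dt$, i.e., $dx = v_x dt$, $dy = v_y dt$, and $dz = v_z dt$. Then, with respect
to an observer that uses coordinates (t’, x’, y’, z’), with the x’ axis
parallel to the x axis, it moves by an amount $d\mathbf{x}'$ in a time dt’, where

$$ dt' = \gamma(v_0)\left( dt + \frac{v_0}{c^2}\, dx \right) , \tag{7.27} $$
$$ dx' = \gamma(v_0)\left( dx + v_0\, dt \right) , \tag{7.28} $$

while dy’ = dy and dz’ = dz. From this, using $\mathbf{v}' = d\mathbf{x}'/dt'$ and $\mathbf{v} = d\mathbf{x}/dt$, we get

$$ v'_x = \frac{v_x + v_0}{1 + \dfrac{v_0 v_x}{c^2}} , \tag{7.29} $$
$$ v'_y = \frac{v_y}{\gamma(v_0)\left( 1 + \dfrac{v_0 v_x}{c^2} \right)} , \tag{7.30} $$
$$ v'_z = \frac{v_z}{\gamma(v_0)\left( 1 + \dfrac{v_0 v_x}{c^2} \right)} . \tag{7.31} $$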

In the limit c — oo (i.e., c much larger than all other velocities in the 
equations) we recover the Galilean composition of velocities, eq. (7.2). 
However, for generic velocities the composition is different. In particular, 
in the limiting case of a particle moving with the speed of light along the
x axis, $v_x = c$, $v_y = v_z = 0$, and a velocity transformation of parameter
$v_0$ again along the x axis, we get $v'_x = c$, $v'_y = v'_z = 0$, independently
of the value of $v_0$! We have therefore recovered the fact that the speed
of light is the same in all inertial frames, which was our starting point.
Notice also that (unless $v_y = v_z = 0$), even the transverse components of
the velocity change when performing a Lorentz boost, contrary to what 
happens in the Galilean transformation. This is due to the fact that we 
are also transforming the time variable. 
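The composition of velocities in eqs. (7.29)-(7.31) is easy to explore numerically. The sketch below is illustrative only: the function name, the units with c = 1, and the sample velocities are assumptions.

```python
import math

def compose_velocity(v, v0, c=1.0):
    """Velocity (v'_x, v'_y, v'_z) seen after a boost of parameter v0 along x,
    following eqs. (7.29)-(7.31)."""
    vx, vy, vz = v
    g = 1.0 / math.sqrt(1.0 - (v0 / c) ** 2)
    denom = 1.0 + v0 * vx / c**2
    return ((vx + v0) / denom, vy / (g * denom), vz / (g * denom))

# A light ray along x keeps speed c, whatever the boost parameter.
light = compose_velocity((1.0, 0.0, 0.0), 0.5)

# Two collinear velocities of 0.5 c compose to 0.8 c, not to c.
collinear = compose_velocity((0.5, 0.0, 0.0), 0.5)

# A purely transverse velocity is reduced by the factor 1/gamma(v0).
transverse = compose_velocity((0.0, 0.5, 0.0), 0.6)
transverse_speed = math.sqrt(sum(comp * comp for comp in transverse))
```

In the last example the resulting speed stays below c even though the Galilean sum of the two velocities would have modulus $\sqrt{0.5^2 + 0.6^2}\,c \simeq 0.78\,c$ with an unchanged transverse component.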


7.2.2 Causality and simultaneity 


Physically, the fact that Nature is invariant under Lorentz transforma- 
tions, rather than under the Galilean transformations of everyday expe- 
rience, introduces a revolution in our notions of space and time. This 
is seen in a particularly stunning way in the change of the concept of 
causality, as well as in the notion of simultaneity of events, as illustrated 
in Fig. 7.1. In this plot, on the vertical axis we display $x^0 = ct$ and
on the horizontal axis one spatial coordinate, say x, while y and z are 
suppressed for graphical reasons. In this plot, light rays travel at 45°, 
corresponding to the fact that the interval between two events connected 


Fig. 7.1 The past and future light 
cones of the observer located at 
the origin (boundaries of the gray 
shaded areas). The white regions 
are causally disconnected from the 
observer at the origin. For the ob- 
server using the (t,x) coordinates, 
the x axis corresponds to simulta- 
neous events, all characterized by 
t =0. For the boosted observer K’, 
the simultaneous events correspond 
to the line labeled t’ = 0.

Fig. 7.2 A three-dimensional rendering of the past and future light cones.


by a light ray has $s^2 = 0$, i.e., $\Delta x = \pm c\,\Delta t$. We see that an observer
located at the origin x = 0 can receive signals, traveling at a speed 
less than or equal to the speed of light, only from the t < 0 part of the
shaded region in the figure; its boundary is called the past light cone of 
that observer (if one makes the plot in three dimensions, with one more 
spatial coordinate, it is indeed a cone with the tip at the origin, see 
Fig. 7.2). The region where this observer can send signals is the shaded 
region of Fig. 7.1 with t > 0, or the corresponding region in Fig. 7.2; its 
boundary is called the future light cone. The white regions to the left 
and to the right in Fig. 7.1 are causally disconnected from the observer 
at the origin: the events that fall in these regions cannot be influenced by
anything that happens at (t = 0,x = 0); and, vice versa, nothing that
happens there can influence the events at (t = 0,x = 0). Only events
on or inside its past light cone can influence the events at (t = 0,x = 0)
and, conversely, what happens at (t = 0,x = 0) can only influence the
events on, or inside, its future light cone.

Related to this, the notion of simultaneity of the events is also rel- 
ative to the observer considered. In the reference frame K, that uses 
coordinates (t,x), the events with the same value of t are simultane- 
ous. In Fig. 7.1, simultaneous events are along lines parallel to the x 
axis, and the events that take place at t = 0 are those along the x 
axis in the figure. However, for the observer K’, that uses coordinates
(t’, x’) related to (t,x) by the Lorentz boost (7.24) and (7.25), the events are
simultaneous if they have the same value of t’. For instance, the events
with t’ = 0 correspond, according to eq. (7.24), to the events on the
straight line $ct = -(v_0/c)\,x$. Since the boost parameter $v_0$ is in the
range $-c < v_0 < c$, these are lines comprised between $ct = -x$ and
$ct = +x$ (with the limiting lines excluded), i.e., contained in the white
regions causally disconnected from the origin; an example is given by 
the line shown in the figure. So, simultaneity is no longer an “absolute” 
concept, but is relative to the observer. 

We see from Fig. 7.1 that, whenever two events are causally discon- 
nected, we can find a boosted reference frame such that, in this frame, 
the two events become simultaneous, since we can always find a straight 
line $ct = a\,x$, with $-1 < a < 1$, that joins the origin with a point in
the white region. Conversely, when two points are causally connected,
i.e., one is on (or inside) the past or the future light cone of the
other, this is not possible. This is graphically clear from the figure and
can also be seen more formally as follows. Consider two events that,
in the frame K, have coordinates $(t_1, \mathbf{x}_1)$ and $(t_2, \mathbf{x}_2)$; let $(t'_1, \mathbf{x}'_1)$ and
$(t'_2, \mathbf{x}'_2)$ be their coordinates in the boosted frame K’. If the events
are simultaneous in K’, we have $t'_1 = t'_2$ and therefore the interval
$s'^2_{12} = -c^2(t'_1 - t'_2)^2 + (\mathbf{x}'_1 - \mathbf{x}'_2)^2 > 0$. However, the interval is invariant
under Lorentz transformations, so we must also have $s^2_{12} > 0$. This
is just the condition that (setting the first event at the origin) the second 
event is in the white, causally disconnected, area of Fig. 7.1. 
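For two events separated along the x axis, the boost that makes them simultaneous can be written down explicitly from eq. (7.24): requiring $\Delta t' = \gamma(v_0)(\Delta t + v_0 \Delta x/c^2) = 0$ gives $v_0 = -c^2 \Delta t/\Delta x$, which satisfies $|v_0| < c$ only for a space-like separation. The following sketch illustrates this; the function names and the units with c = 1 are assumptions, and the separation is taken to be purely along x.

```python
import math

def simultaneity_boost(dt, dx, c=1.0):
    """Boost parameter v0 (along x) for which two events separated by (dt, dx)
    become simultaneous, or None if the separation is not space-like."""
    if abs(c * dt) >= abs(dx):
        return None          # time-like or light-like: no such boost exists
    return -c**2 * dt / dx

def boosted_time_difference(dt, dx, v0, c=1.0):
    """Time difference in the boosted frame, from eq. (7.24)."""
    g = 1.0 / math.sqrt(1.0 - (v0 / c) ** 2)
    return g * (dt + v0 * dx / c**2)

v0 = simultaneity_boost(0.5, 1.0)                 # space-like separation
dt_prime = boosted_time_difference(0.5, 1.0, v0)  # vanishes in the boosted frame
no_boost = simultaneity_boost(1.0, 0.5)           # time-like separation
```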
7.2.3 Proper time and time dilatation


We next define the proper time $\tau$ of an observer (or, e.g., of a particle).
Suppose that, with respect to a given inertial observer K, a second
observer O moves with velocity $\mathbf{v}(t)$, with $v(t) = |\mathbf{v}(t)|$ strictly smaller
than c. We do not need to assume that $\mathbf{v}$ is constant, i.e., the frame
moving with O need not be an inertial frame. We want to understand the
relation between the time t measured by the clock of the inertial observer
K, and the time $\tau$ measured by a clock moving with O. To this purpose,
we consider an inertial reference frame K’ such that, at some time t, O
and K’ have the same velocity, i.e., the frame K’ is (instantaneously) 
comoving with O. We can imagine that at time t the observer O emits 
a first signal and at time t + dt it emits a second signal. Each of these 
signals marks an event, and we can compute the infinitesimal interval 
between these two events. In the frame K, during the time interval dt, 
the observer O has moved by dx = v(t)dt. Therefore, the corresponding 
interval between the two events is 


$$ ds^2 = -c^2 dt^2 + d\mathbf{x}^2 = -c^2 dt^2 \left[ 1 - \frac{v^2(t)}{c^2} \right] , \tag{7.32} $$

where $v(t) = |\mathbf{v}(t)|$. In the inertial frame K’, in contrast, the observer O
is instantaneously at rest so, to linear order in dt, $d\mathbf{x}' = 0$. Then, calling
$d\tau$ the time interval measured by a clock carried by the observer O, to
lowest order in the infinitesimal quantity dt the interval measured in the
inertial frame K’ is

$$ ds'^2 = -c^2 d\tau^2 . \tag{7.33} $$
Since the intervals measured in two inertial frames must be the same,
we get
$$ d\tau = dt\,\sqrt{1 - \frac{v^2(t)}{c^2}} = \frac{dt}{\gamma(v)} , \tag{7.34} $$

where $\gamma(v)$ was defined in eq. (7.26). This relation can be integrated
(which, physically, means that we are using a succession of comoving
inertial frames) so that, choosing the origin of times so that $t = t_0$
corresponds to $\tau = \tau_0$, we get

$$ \tau(t) - \tau_0 = \int_{t_0}^{t} dt'\, \sqrt{1 - \frac{v^2(t')}{c^2}} . \tag{7.35} $$


The quantity $\tau$ is called the proper time of the observer O. It is the
time measured by the clock carried by this observer. Note that, since
$\sqrt{1 - v^2/c^2}$ is always smaller than one, $d\tau$ is always smaller than dt.
From the point of view of the observer K, the clock carried by O goes 
slower. This is the famous phenomenon of time dilatation of Special 
Relativity. 
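Equation (7.35) can be evaluated numerically for any prescribed speed $v(t) < c$. The sketch below uses a simple midpoint rule; the function name, the number of sub-intervals `n`, the units with c = 1, and the two sample velocity profiles are assumptions made for illustration. The second profile, $v(t) = t/\sqrt{1+t^2}$, is chosen only because its integral is known in closed form, $\tau = \mathrm{arcsinh}(t)$.

```python
import math

def proper_time(speed, t0, t1, c=1.0, n=10000):
    """Midpoint-rule estimate of the integral in eq. (7.35):
    tau(t1) - tau(t0) = int_{t0}^{t1} dt' sqrt(1 - v(t')^2 / c^2)."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t_mid = t0 + (i + 0.5) * h
        total += math.sqrt(1.0 - (speed(t_mid) / c) ** 2)
    return total * h

# Constant speed 0.6 c: the moving clock accumulates 0.8 of the time of K.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0)

# Time-dependent speed v(t) = t / sqrt(1 + t^2), for which tau = asinh(t).
tau_accelerated = proper_time(lambda t: t / math.sqrt(1.0 + t * t), 0.0, 2.0)
```

In both cases the elapsed proper time is smaller than the elapsed coordinate time, as stated after eq. (7.35).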

This apparently leads to a paradox. Suppose that O actually moves 
with constant velocity v, so that now also the frame moving with O is an 
7 To be precise, this second clock, that
at the beginning is at the same posi- 
tion as the first, and with the same 
zero velocity with respect to K, must 
have been brought to the second posi- 
tion very gently, i.e., giving to it a negli- 
gible acceleration at the beginning, and 
a negligible deceleration to eventually 
stop it in the final position. This is in 
principle always possible, at the level of 
these “gedanken” experiments. Accel- 
eration and deceleration indeed affect 
the reading of a clock, as one learns in 
General Relativity. 


8 Alternatively, the clock in K’ might 
invert its motion and come back to 
meet the clock in K again; this how- 
ever introduces extra complications due 
to the corresponding phase of accelera- 
tion, so K’ is no longer inertial. In the 
context of General Relativity, this pro- 
duces another apparent paradox called 
the Twin Paradox, on which we will not 
dwell here. 


inertial frame, that coincides with K’ at all times. Then, from the point 
of view of an observer in K, the clock carried by the inertial observer in 
K' goes slower but, exactly by the same reasoning, the inertial observer 
K’ will rather find that the clock carried by K goes slower! 

In fact, there is no logical contradiction with this and, rather, this 
apparent paradox is at the core of the notion of “Relativity.” This 
can be understood by specifying more carefully what should be done, 
operationally, to compare the two clocks. Suppose that, at some initial 
time, the clocks in K and that in K’ are together at the same point in 
space and are both set to the same initial value of time, say t = t’ = 0. To 
determine what the clock in K’ measures at a subsequent time, compared 
to a clock in K, we need a second clock, at rest with respect to K and 
located in a second position, that the observer in K must have previously 
synchronized with the first clock. That is, these two clocks belonging to 
K have been first carried to the same place, where it has been checked 
that they both read the same time, and then one has been brought to 
the second position.⁷ When the clock carried by the observer in K’ has
reached the position of this second clock, a comparison can be
performed. However, now the situation is no longer symmetric between
the two frames. We are comparing one clock in K’ with two clocks in K.
The clock that goes slower is the one that is compared with two clocks
of the other frame.⁸

We can also consider a more symmetric situation, in which each ob- 
server prepares two clocks, synchronizing them in his/her frame, and 
uses them to compare with a clock of the other frame. Again, each 
observer will find that the other observer’s clock goes slower; the clock 
that goes slower is always the one that is checked against two clocks of 
the other frame. Observe also that, for the observer K’, the two clocks 
in K are not synchronized! We indeed see from Fig. 7.1 that, if in K
a clock is at $(t = 0, x = 0)$ and another is at $(t = 0, x = x_0)$, for some
$x_0 \neq 0$ (and therefore in K the two clocks are synchronized, since they
both register the same value of time, in this case t = 0), from the point 
of view of a boosted observer K’ they will not be synchronized. For K’, 
synchronized clocks are those that, in this space-time diagram, can be 
found on a line such as the t’ = 0 line shown in Fig. 7.1. So, for instance, 
the inertial observer K could use two clocks, synchronize them (from his 
point of view), place them at two different positions, and use them to 
compare with one clock of K’, and he would find that the clock of K’ 
goes slower. The observer K’ could do exactly the same, preparing two 
clocks synchronized in her frame, and use them to check a clock of K. 
Again, she would find that the clock in K goes slower. The observer 
K would attribute this different result to the fact that K’ had made a 
mistake: from his point of view, the two clocks used by K’ were not 
correctly synchronized. The observer K’ would reach the same conclu- 
sion: the fault was in the fact that the clocks in K were not correctly 
synchronized! In fact, both observers were right, and simply there is no 
“absolute” notion of which clock goes slower. The fact that a moving 
clock goes slower is a correct statement, relative to the (two) clocks of
the observer that sees that clock in motion. This is one instance of the
fact that some statements that, in Newtonian physics, have an absolute
validity (a clock either goes slower than another or it does not), in Special
Relativity can have a validity only relative to some observer (hence,
the name “Relativity” given to the theory).
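The symmetry of the two points of view, and the fact that clocks synchronized in K are not synchronized in K’, can be checked on a few events. This is only a sketch in units with c = 1; the function `boost` assumes the sign convention of eqs. (7.45) and (7.46), and the numerical values are illustrative.

```python
import math

def boost(t, x, beta):
    """Boost along x in units with c = 1, with the signs of eqs. (7.45)-(7.46)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (t + beta * x), gamma * (x + beta * t)

beta = 0.6                                  # illustrative value, gamma = 1.25
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

# Two clocks of K, synchronized in K: both read t = 0, at x = 0 and at x = x0.
x0 = 1.0
t_a_prime, _ = boost(0.0, 0.0, beta)
t_b_prime, _ = boost(0.0, x0, beta)         # differs from t_a_prime: not simultaneous in K'

# A clock at rest in K (x = 0) reads T; K' assigns to this event the time gamma * T.
T = 1.0
t_prime_of_K_clock, _ = boost(T, 0.0, beta)

# A clock at rest in K' (x' = 0) follows x = -beta * t in K with this sign convention.
# When the K-time T has elapsed, it reads only T / gamma.
reading_of_Kprime_clock, x_prime = boost(T, -beta * T, beta)
```

Each observer assigns a longer time to the interval between two ticks of the other observer's single clock, which is the symmetric statement discussed above.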


7.2.4 Lorentz contraction 


In a similar way we can prove that, given two inertial observers K and
K’, the length of a rigid rod depends on the velocity at which the
observer sees it moving. Consider first the frame K, where the rod is at
rest along the x axis. If we call $\ell$ its length in this frame, the coordinates
of its two endpoints can be taken, respectively, as $(x_1, 0, 0)$ and $(x_2, 0, 0)$,
with $x_2 - x_1 = \ell$. Inverting eqs. (7.24) and (7.25), the relation between
the coordinates $(t, x)$ in K and the coordinates $(t', x')$ in K’ is

$$ t = \gamma(v_0) \left( t' - \frac{v_0}{c^2} x' \right) , \qquad (7.36) $$
$$ x = \gamma(v_0) \left( x' - v_0 t' \right) . \qquad (7.37) $$


It is straightforward to explicitly check that this provides the inversion
of eqs. (7.24) and (7.25), but in fact the result can be obtained much
more simply by reversing the sign of $v_0$. In the frame K’, the bar moves
with velocity $v_0$ along the x axis, and, at a given time $t'$, the position of
its end-points will be $(x'_1, 0, 0)$ and $(x'_2, 0, 0)$, respectively. The observer
in K’ will define the length of the bar as the difference in the position of
its end-points, $x'_2 - x'_1$, measured at the same value of her time variable
$t'$. From eq. (7.37), we have

$$ x'_2 = \frac{1}{\gamma(v_0)} x_2 + v_0 t' , \qquad (7.38) $$
$$ x'_1 = \frac{1}{\gamma(v_0)} x_1 + v_0 t' , \qquad (7.39) $$

and therefore

$$ x'_2 - x'_1 = \frac{1}{\gamma(v_0)} (x_2 - x_1) . \qquad (7.40) $$

Therefore, the length $\ell' = (x'_2 - x'_1)$, measured in a frame where the rod
moves with velocity $v_0$, is related to the length $\ell$ in the frame where the
rod is at rest, by

$$ \ell' = \ell \sqrt{1 - \frac{v_0^2}{c^2}} . \qquad (7.41) $$


This is the Lorentz contraction of lengths. Note that the contraction only
takes place in the direction of motion. The coordinates of the transverse
directions, for a Lorentz boost along the x axis, satisfy $y' = y$ and $z' = z$,
so transverse directions are not affected.

9 More precisely, as we will see below, this transformation property
defines “contravariant” four-vectors, to be distinguished from “covariant”
four-vectors that will be introduced in Section 7.3.2.
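The Lorentz contraction of eq. (7.41) can be reproduced numerically by solving eq. (7.37) for x’ and evaluating both endpoints at the same t’. The sketch below uses units with c = 1 by default; the helper names and the numerical values are illustrative assumptions.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def endpoint_in_K_prime(x, t_prime, v0, c=1.0):
    """Position x' at time t' of a point at rest at x in K, from x = gamma (x' - v0 t')."""
    return x / gamma(v0, c) + v0 * t_prime

v0 = 0.8                 # illustrative speed in units of c, gamma = 5/3
x1, x2 = 2.0, 5.0        # endpoints of the rod in K, rest length 3
t_prime = 7.0            # any common value of t' gives the same length

length_rest = x2 - x1
length_moving = (endpoint_in_K_prime(x2, t_prime, v0)
                 - endpoint_in_K_prime(x1, t_prime, v0))
```

The term proportional to t’ cancels in the difference, which is why the result does not depend on the instant at which K’ performs the measurement, provided both endpoints are read at the same t’.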


7.3 The mathematics of the Lorentz group 


7.3.1 Four-vectors and Lorentz tensors 


We now introduce a covariant formalism, that will make the transforma- 
tion properties of the various quantities under Lorentz transformations 
explicit. In the case of rotations, in Section 1.7.2 we have defined the 
rotation group as the group of linear transformations (1.140), in a space 
with d spatial dimensions, which leave the quadratic form (1.141) in- 
variant. We have seen that this implies that R is an orthogonal matrix. 
Vectors were then defined as objects that transform according to the
“fundamental” representation, $v_i \to v'_i = R_{ij} v_j$, i.e., with the same
matrix $R_{ij}$ used in the definition itself of the group. It was also useful to
introduce a “metric tensor” $\delta_{ij}$, so that the scalar product between two
vectors is given by $\mathbf{v} \cdot \mathbf{w} = \delta_{ij} v_i w_j$, so in particular the squared norm of
a vector $\mathbf{v}$ is $|\mathbf{v}|^2 = \delta_{ij} v_i v_j$.

Similarly, after eq. (7.13) we have defined the Lorentz group as the
group of linear transformations of a four-dimensional space, with
coordinates $(x^0, x^1, x^2, x^3)$, which leaves invariant the quadratic form

$$ s^2 = -(x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 . \qquad (7.42) $$


Generalizing eq. (1.140), we now write such linear transformations with
the notation

$$ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu , \qquad (7.43) $$

where the “Lorentz index” $\mu$ takes the values 0, 1, 2, 3 and, again, the
sum over repeated indices is understood. Note, however, that, contrary
to the case of the rotation group, we are now careful about the positioning
of the indices, and the sum is always performed by contracting an
upper and a lower index. The reason for this convention will become
apparent in the following.

Equation (7.43) is the transformation law that is used to define the
Lorentz group and therefore, just as for vectors in the case of rotations,
can also be used to introduce the “fundamental” representation of the
Lorentz group, that we call the four-vector representation: four-vectors
are defined as any set of four quantities $(V^0, V^1, V^2, V^3)$ [or, with an
equivalent notation, $(V^0, V^x, V^y, V^z)$], collectively denoted as $V^\mu$, that,
under Lorentz transformations, transform linearly among themselves,
according to⁹

$$ V^\mu \to V'^\mu = \Lambda^\mu{}_\nu V^\nu . \qquad (7.44) $$


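For instance, for a Lorentz boost along the x axis with velocity $v_0$, we
saw in eqs. (7.22) and (7.23) that

$$ x'^0 = \gamma(v_0) (x^0 + \beta x) , \qquad (7.45) $$
$$ x' = \gamma(v_0) (x + \beta x^0) , \qquad (7.46) $$

where, for the parameter of the transformation, we have introduced the
notation $\boldsymbol{\beta} = \mathbf{v}_0/c$ and $\beta = |\boldsymbol{\beta}|$. Then, for a generic four-vector $V^\mu$,

$$ V^0 \to V'^0 = \gamma(v_0) (V^0 + \beta V^x) , \qquad (7.47) $$
$$ V^x \to V'^x = \gamma(v_0) (V^x + \beta V^0) , \qquad (7.48) $$

(where, of course, the notation $V^0$ for the $\mu = 0$ component of the
four-vector $V^\mu$ should not be confused with the velocity $v_0$ of the
transformation), while $V'^y = V^y$ and $V'^z = V^z$. For a boost in a generic
direction, we can write¹⁰

$$ V^0 \to V'^0 = \gamma(v_0) (V^0 + \boldsymbol{\beta} \cdot \mathbf{V}) , \qquad (7.49) $$
$$ V_\parallel \to V'_\parallel = \gamma(v_0) (V_\parallel + \beta V^0) , \qquad (7.50) $$
$$ \mathbf{V}_\perp \to \mathbf{V}'_\perp = \mathbf{V}_\perp , \qquad (7.51) $$

where we have split $\mathbf{V} = (V^x, V^y, V^z)$ into its components parallel and
perpendicular to $\boldsymbol{\beta}$,

$$ \mathbf{V} = V_\parallel \hat{\boldsymbol{\beta}} + \mathbf{V}_\perp . \qquad (7.52) $$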


The four-vector representation has dimension four and is irreducible
since, with rotations, we can mix among themselves all the spatial
components $V^i$, while with boosts we can mix $V^0$ with any of the spatial
components.
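The boost in a generic direction, eqs. (7.49) to (7.51), can be written as a single 4 x 4 matrix. The sketch below assumes the standard form of that matrix, with spatial block δ_ij + (γ − 1) β_i β_j / β², which is not spelled out in the text; the function names and test values are illustrative.

```python
import math

def boost_matrix(beta_vec):
    """4x4 boost for a nonzero velocity vector beta (units of c), signs as in eqs. (7.49)-(7.51)."""
    b2 = sum(b * b for b in beta_vec)
    g = 1.0 / math.sqrt(1.0 - b2)
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = g * beta_vec[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            L[i + 1][j + 1] = delta + (g - 1.0) * beta_vec[i] * beta_vec[j] / b2
    return L

def apply(L, V):
    """Matrix-vector product Lambda^mu_nu V^nu."""
    return [sum(L[m][n] * V[n] for n in range(4)) for m in range(4)]

beta_vec = [0.3, 0.4, 0.0]                 # illustrative, |beta| = 0.5
b = math.sqrt(sum(x * x for x in beta_vec))
g = 1.0 / math.sqrt(1.0 - b * b)
n_hat = [x / b for x in beta_vec]

V = [2.0, 1.0, -1.0, 0.5]
Vp = apply(boost_matrix(beta_vec), V)

# Components parallel and perpendicular to beta, before and after the boost.
V_par = sum(n_hat[i] * V[i + 1] for i in range(3))
Vp_par = sum(n_hat[i] * Vp[i + 1] for i in range(3))
V_perp = [V[i + 1] - V_par * n_hat[i] for i in range(3)]
Vp_perp = [Vp[i + 1] - Vp_par * n_hat[i] for i in range(3)]
```

Comparing `Vp[0]`, `Vp_par` and `Vp_perp` with the right-hand sides of eqs. (7.49), (7.50) and (7.51) gives agreement to rounding error.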

Similarly to what we have done for the rotation group, after having
defined the four-vector representation, we can proceed to define tensor
representations of the Lorentz group (that we will call “Lorentz tensors,”
or, when the context is clear, simply “tensors”). For instance, a tensor
$T^{\mu\nu}$ with two upper indices (or a “contravariant” tensor) is defined as
an object that, under Lorentz transformations, changes as

$$ T^{\mu\nu} \to T'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma T^{\rho\sigma} , \qquad (7.53) $$

and similarly for tensors with three or more Lorentz indices.

We next introduce the Minkowski metric¹¹

$$ \eta_{\mu\nu} = {\rm diag}(-1, 1, 1, 1) . \qquad (7.54) $$

This plays a role analogous to the metric $\delta_{ij}$ in the case of rotations,
in the sense that it allows us to define the scalar product between two
four-vectors $V^\mu$ and $W^\mu$, as

$$ (VW) = \eta_{\mu\nu} V^\mu W^\nu , \qquad (7.55) $$

so that the squared norm of a four-vector $V^\mu$ is $V^2 = \eta_{\mu\nu} V^\mu V^\nu$. Notice
that this scalar product is not positive definite, and $V^2$ can be positive,
negative, or zero. The infinitesimal interval (7.8) can then be rewritten
as

$$ ds^2 = \eta_{\mu\nu} dx^\mu dx^\nu . \qquad (7.56) $$
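A short sketch of the scalar product (7.55) shows the three possible signs of the squared norm. The labels "timelike", "spacelike" and "null" are the standard names for these cases in the (−, +, +, +) signature; they and the example vectors are additions for illustration.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta_{mu nu}, eq. (7.54)

def minkowski_dot(V, W):
    """Scalar product eta_{mu nu} V^mu W^nu of two four-vectors given by upper components."""
    return sum(ETA[m] * V[m] * W[m] for m in range(4))

def classify(V):
    """Label a four-vector by the sign of V^2 in the (-,+,+,+) signature."""
    norm2 = minkowski_dot(V, V)
    if norm2 < 0:
        return "timelike"
    if norm2 > 0:
        return "spacelike"
    return "null"

examples = {
    "at rest": [1.0, 0.0, 0.0, 0.0],
    "light ray": [1.0, 1.0, 0.0, 0.0],
    "purely spatial": [0.0, 0.0, 2.0, 0.0],
}
labels = {name: classify(V) for name, V in examples.items()}
```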


10 In Section 7.3.5 we will show that the spatial components $(V^x, V^y, V^z)$
of a four-vector $V^\mu$ transform as a vector under rotations, so we already
use the notation $\mathbf{V} = (V^x, V^y, V^z)$.

11 At first, we introduce $\eta_{\mu\nu}$ as a fixed matrix, given by eq. (7.54) in all
reference frames. More precisely, we will see in Section 7.3.3 that it is
actually an invariant tensor of the Lorentz group, i.e., a tensor that keeps
the same numerical value in all frames related by Lorentz transformations,
similarly to the metric $\delta_{ij}$ for rotations, as we saw in eq. (1.132).

12 More generally, in the group theory language developed in Section 1.7,
the group of transformations of a space with coordinates
$(y_1, \ldots, y_m, x_1, \ldots, x_n)$, which leaves invariant the quadratic form

$$ s^2 = -(y_1^2 + \ldots + y_m^2) + (x_1^2 + \ldots + x_n^2) , $$

is called the orthogonal group O(n, m) [or, equivalently, O(m, n)], and it
reduces to O(n) if m = 0. Thus, in three spatial dimensions, the group
that leaves invariant the quadratic form $s^2 = -(x^0)^2 + x^2 + y^2 + z^2$ is called
O(3, 1). However, in O(n, m) there are both transformations with determinant
+1 and with determinant −1. The transformations with determinant +1
form a subgroup, which is denoted by SO(n, m). Thus, the “proper” Lorentz
group, which is defined by eliminating the discrete parity transformations, is
actually SO(3, 1). When we refer to the Lorentz group, we will henceforth
always mean the proper Lorentz group SO(3, 1), similarly to our restriction
from O(3) to SO(3) for rotations. This can be generalized to arbitrary
spatial dimensions. Just as SO(d) is the group of (proper) rotations in a
d-dimensional space, SO(d, 1) is the Lorentz group in a space-time with d
spatial dimensions and a time-like coordinate.


In eq. (1.129) we saw that the condition that rotations must preserve
the quadratic form (1.141) restricts $R_{ij}$ to orthogonal matrices. We
now derive the analogous condition on $\Lambda^\mu{}_\nu$. Writing $x'^\mu = \Lambda^\mu{}_\rho x^\rho$ and
requiring that

$$ \eta_{\mu\nu} x'^\mu x'^\nu = \eta_{\mu\nu} x^\mu x^\nu , \qquad (7.57) $$

we get

$$ \eta_{\mu\nu} (\Lambda^\mu{}_\rho x^\rho)(\Lambda^\nu{}_\sigma x^\sigma) = \eta_{\mu\nu} x^\mu x^\nu . \qquad (7.58) $$

By renaming the dummy indices $\mu \to \rho$, $\nu \to \sigma$ on the right-hand side,
and rearranging the factors, we get

$$ \eta_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma x^\rho x^\sigma = \eta_{\rho\sigma} x^\rho x^\sigma . \qquad (7.59) $$

Since this must hold for generic $x$, we must have

$$ \eta_{\mu\nu} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma = \eta_{\rho\sigma} . \qquad (7.60) $$

This is the analogue of the condition (1.129) for the rotation group.
In matrix notation, eq. (7.60) can be rewritten as

$$ \Lambda^T \eta \Lambda = \eta , \qquad (7.61) $$

where $(\Lambda^T)_\rho{}^\mu = \Lambda^\mu{}_\rho$ is the transpose matrix. Taking the determinant
of both sides, we get $(\det \Lambda)^2 = 1$, and therefore $\det \Lambda = \pm 1$.
Transformations with $\det \Lambda = -1$ can always be written as the product of
a transformation with $\det \Lambda = 1$ and of a discrete transformation that
reverses the sign of an odd number of coordinates, e.g., a parity
transformation $(t, x, y, z) \to (t, -x, -y, -z)$, or a reflection around a single
spatial axis, such as $(t, x, y, z) \to (t, -x, y, z)$, or a time-reversal
transformation, $(t, x, y, z) \to (-t, x, y, z)$. Transformations with $\det \Lambda = +1$
are called proper Lorentz transformations.¹²
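The defining condition (7.61) and the sign of the determinant are easy to test on explicit matrices. The following sketch uses plain nested lists; the helper names, the chosen boost and rotation parameters, and the counter-example `shear` are illustrative assumptions.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det(A):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def is_lorentz(L, tol=1e-12):
    """Check Lambda^T eta Lambda = eta, eq. (7.61), entry by entry."""
    M = matmul(transpose(L), matmul(ETA, L))
    return all(abs(M[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, g * beta, 0, 0], [g * beta, g, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1.0]]

parity = [[1.0, 0, 0, 0], [0, -1.0, 0, 0], [0, 0, -1.0, 0], [0, 0, 0, -1.0]]
shear = [[1.0, 0.1, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]  # not a Lorentz matrix

checks = {
    "boost": is_lorentz(boost_x(0.6)),
    "rotation": is_lorentz(rotation_z(0.7)),
    "rotation after boost": is_lorentz(matmul(rotation_z(0.7), boost_x(0.6))),
    "parity": is_lorentz(parity),
    "shear": is_lorentz(shear),
}
determinants = {"boost": det(boost_x(0.6)), "parity": det(parity)}
```

Boosts, rotations and their products satisfy the condition with determinant +1, parity satisfies it with determinant −1, and a generic linear map such as `shear` does not satisfy it at all.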


7.3.2 Contravariant and covariant quantities 


From the metric $\eta_{\mu\nu}$ and a contravariant four-vector $V^\mu$, we can form a
set of four quantities $V_\mu$, with lower index, defined by

$$ V_\mu = \eta_{\mu\nu} V^\nu , \qquad (7.62) $$

called a covariant four-vector. Explicitly,

$$ V_0 = -V^0 , \qquad V_i = +V^i . \qquad (7.63) $$

It is also convenient to define a matrix $\eta^{\mu\nu}$, with both upper indices,
whose numerical values are still the same as for $\eta_{\mu\nu}$, i.e.,

$$ \eta^{\mu\nu} = {\rm diag}(-1, 1, 1, 1) . \qquad (7.64) $$

With our convention that Lorentz indices are summed over by contracting
an upper and a lower index, we can use $\eta^{\mu\nu}$ to invert eq. (7.62),
writing

$$ V^\mu = \eta^{\mu\nu} V_\nu , \qquad (7.65) $$

since this gives $V^0 = -V_0$ and $V^i = +V_i$, which is the (obvious) inversion
of eq. (7.63). So, $\eta_{\mu\nu}$ can be used to lower the index of a contravariant
four-vector, obtaining a covariant one; and, vice versa, $\eta^{\mu\nu}$ can be used
to raise the index of a covariant four-vector, obtaining a contravariant
four-vector.

Consider now the combination $\eta_{\mu\rho} \eta^{\rho\nu}$. Numerically, this is just the
identity matrix. We denote it by $\delta_\mu^\nu$,

$$ \delta_\mu^\nu \equiv \eta_{\mu\rho} \eta^{\rho\nu} = {\rm diag}(1, 1, 1, 1) , \qquad (7.66) $$

where the position of the indices on $\delta_\mu^\nu$ (one upper and one lower)
matches the position in $\eta_{\mu\rho} \eta^{\rho\nu}$. Observe that we have obvious
identities such as $V^\mu = \delta^\mu_\nu V^\nu$.

In terms of a covariant and a contravariant four-vector, the scalar
product (7.55) can then be rewritten as

$$ (VW) = V_\mu W^\mu = V^\mu W_\mu . \qquad (7.67) $$

Explicitly,

$$ V_\mu W^\mu = V_0 W^0 + V_1 W^1 + V_2 W^2 + V_3 W^3 , \qquad (7.68) $$

so, using a covariant and a contravariant four-vector, the scalar product
takes a Euclidean form, with all plus signs.

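By definition, a contravariant four-vector $V^\mu$ is an object that
transforms as in eq. (7.44). From eq. (7.62), it then follows that the
corresponding covariant four-vector transforms as

$$ V_\mu = \eta_{\mu\sigma} V^\sigma \to \eta_{\mu\sigma} \Lambda^\sigma{}_\rho V^\rho = \eta_{\mu\sigma} \Lambda^\sigma{}_\rho \eta^{\rho\nu} V_\nu , \qquad (7.69) $$

and therefore

$$ V_\mu \to V'_\mu = \Lambda_\mu{}^\nu V_\nu , \qquad (7.70) $$

where

$$ \Lambda_\mu{}^\nu = \eta_{\mu\sigma} \Lambda^\sigma{}_\rho \eta^{\rho\nu} . \qquad (7.71) $$

The matrices $\Lambda_\mu{}^\nu$ and $\Lambda^\mu{}_\nu$ are different: because of the $\eta_{\mu\nu}$ involved
in the transformation, some of their matrix elements differ by a minus
sign. In particular, $\Lambda^0{}_0 = \Lambda_0{}^0$, $\Lambda^0{}_i = -\Lambda_0{}^i$, $\Lambda^i{}_0 = -\Lambda_i{}^0$, and
$\Lambda^i{}_j = \Lambda_i{}^j$. However, physically $V^\mu$ and $V_\mu$ represent the same quantity in
a different notation and, in the language of representation theory, they
correspond to equivalent representations, related as in eq. (1.137), with
$\eta_{\mu\nu}$ playing the role of the matrix S.¹³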
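The relation between the two matrices in eq. (7.71) can be made concrete for a boost along x. This is a minimal sketch: since η is diagonal, raising and lowering reduces to multiplying components by ±1. The function names and the sample four-vectors are illustrative.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta_{mu nu}; eta^{mu nu} has the same entries

def lower(V):
    """V_mu = eta_{mu nu} V^nu, eq. (7.62)."""
    return [ETA[m] * V[m] for m in range(4)]

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, g * beta, 0.0, 0.0], [g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def covariant_matrix(L):
    """Lambda_mu^nu = eta_{mu sigma} Lambda^sigma_rho eta^{rho nu}, eq. (7.71)."""
    return [[ETA[m] * L[m][n] * ETA[n] for n in range(4)] for m in range(4)]

def apply(M, V):
    return [sum(M[m][n] * V[n] for n in range(4)) for m in range(4)]

L_up = boost_x(0.6)
L_down = covariant_matrix(L_up)

V, W = [2.0, 1.0, -1.0, 0.5], [1.0, 0.0, 3.0, -2.0]
V_low_transformed = apply(L_down, lower(V))          # transform V_mu with Lambda_mu^nu
V_transformed_then_lowered = lower(apply(L_up, V))   # transform V^mu, then lower the index

contraction_before = sum(a * b for a, b in zip(lower(V), W))
contraction_after = sum(a * b for a, b in zip(V_low_transformed, apply(L_up, W)))
```

The two routes to the transformed covariant components agree, the time-space entries of `L_down` and `L_up` differ by a sign, and the contraction of a lower with an upper index is unchanged by the boost.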

Similarly, we can use $\eta_{\mu\nu}$ to lower one or more indices of a tensor. For
instance, given a contravariant tensor with two indices $T^{\mu\nu}$, defined by
the fact that it transforms as in eq. (7.53), we can define a covariant
tensor $T_{\mu\nu}$ as

$$ T_{\mu\nu} = \eta_{\mu\rho} \eta_{\nu\sigma} T^{\rho\sigma} . \qquad (7.72) $$


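13 More abstractly, one can define contravariant four-vectors as objects $V^\mu$
that transform as in eq. (7.44), and covariant four-vectors as objects $W_\mu$ that
transform as

$$ W_\mu \to W'_\mu = \Lambda_\mu{}^\nu W_\nu , $$

with $\Lambda_\mu{}^\nu$ defined by eq. (7.71). At this point, however, one would discover
that, given any contravariant four-vector $V^\mu$, the quantity $V_\mu = \eta_{\mu\nu} V^\nu$ is
a covariant four-vector and vice versa, given any covariant four-vector $W_\mu$,
the quantity $W^\mu = \eta^{\mu\nu} W_\nu$ is a contravariant four-vector. Therefore, the
spaces of covariant and contravariant four-vectors are in one-to-one
correspondence, so we do not lose generality by defining covariant
four-vectors starting from contravariant four-vectors and lowering their
indices. The same holds for the covariant and contravariant tensors that we
now introduce.

14 To get eq. (7.80) we multiply both sides of eq. (7.60) by $\eta^{\sigma\alpha} \Lambda^\beta{}_\alpha$. This
gives

$$ \eta_{\mu\nu} \Lambda^\mu{}_\rho \left( \Lambda^\nu{}_\sigma \eta^{\sigma\alpha} \Lambda^\beta{}_\alpha \right) = \eta_{\rho\sigma} \eta^{\sigma\alpha} \Lambda^\beta{}_\alpha . \qquad (7.76) $$

The right-hand side can be rewritten as

$$ \eta_{\rho\sigma} \eta^{\sigma\alpha} \Lambda^\beta{}_\alpha = \delta_\rho^\alpha \Lambda^\beta{}_\alpha = \Lambda^\beta{}_\rho = \eta_{\mu\nu} \eta^{\nu\beta} \Lambda^\mu{}_\rho , $$

where we used $\eta_{\mu\nu} \eta^{\nu\beta} = \delta_\mu^\beta$. Therefore, writing $\eta_{\mu\nu} \Lambda^\mu{}_\rho = \Lambda_{\nu\rho}$ on both
sides of eq. (7.76),

$$ \Lambda_{\nu\rho} \left( \Lambda^\nu{}_\sigma \eta^{\sigma\alpha} \Lambda^\beta{}_\alpha \right) = \Lambda_{\nu\rho} \eta^{\nu\beta} . \qquad (7.77) $$

Since $\Lambda_{\nu\rho}$ is an invertible matrix, we can factorize it out from this equation,
and we get

$$ \Lambda^\nu{}_\sigma \eta^{\sigma\alpha} \Lambda^\beta{}_\alpha = \eta^{\nu\beta} \qquad (7.78) $$

or, renaming the indices as $\sigma \to \rho$, $\alpha \to \sigma$, $\nu \to \mu$ and $\beta \to \nu$,

$$ \eta^{\rho\sigma} \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma = \eta^{\mu\nu} . \qquad (7.79) $$

Lowering the $\mu$, $\nu$ indices on both sides, and inverting the upper/lower position
of the contracted $\rho$, $\sigma$ indices, we finally obtain eq. (7.80).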


Proceeding as in eq. (7.69) we see that it transforms as

$$ T_{\mu\nu} \to T'_{\mu\nu} = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma T_{\rho\sigma} . \qquad (7.73) $$

We can also define tensors with mixed covariant and contravariant
indices. For instance, defining

$$ T^\mu{}_\nu = \eta_{\nu\rho} T^{\mu\rho} , \qquad (7.74) $$

we find that it transforms as

$$ T^\mu{}_\nu \to T'^\mu{}_\nu = \Lambda^\mu{}_\rho \Lambda_\nu{}^\sigma T^\rho{}_\sigma . \qquad (7.75) $$

We can proceed in the same way for tensors with three or more indices.
For later use, we observe that, in terms of $\Lambda_\mu{}^\nu$, the condition (7.60)
can be written as¹⁴

$$ \eta_{\rho\sigma} \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma = \eta_{\mu\nu} . \qquad (7.80) $$

This is similar to eq. (7.60), except that now, on the left-hand side, the
contraction of the two indices of $\eta$ is made with the second indices of
each of the two $\Lambda$ matrices, rather than with the first indices.


7.3.3 Invariant tensors of the Lorentz group 


The notation $\eta_{\mu\nu}$ for the metric (7.54), with two lower Lorentz indices,
implies that it is a covariant tensor. However, $\eta_{\mu\nu}$ is a special type of
covariant tensor, that retains the same numerical value of its
components in all frames connected by Lorentz transformations, i.e., is an
invariant tensor of the Lorentz group [just as we found that $\delta_{ij}$ is an
invariant tensor of the rotation group, see eq. (1.132)]. Indeed, consider
a covariant tensor $T_{\mu\nu}$ whose components, in a given frame, are given
numerically by $T_{\mu\nu} = \eta_{\mu\nu} = {\rm diag}(-1, 1, 1, 1)$. After a Lorentz
transformation, this tensor becomes $T'_{\mu\nu} = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma \eta_{\rho\sigma}$. However, because of the
defining property of the Lorentz group, written in the form (7.80), the
right-hand side of this equation is just $\eta_{\mu\nu}$ again, so $T'_{\mu\nu} = \eta_{\mu\nu}$. Thus,
$\eta_{\mu\nu}$ is an invariant tensor with two lower indices.

Similarly, using eq. (7.79), we see that $\eta^{\mu\nu}$ is an invariant tensor with
two upper indices. The same holds for $\delta^\mu_\nu$ that, being constructed from
$\eta_{\mu\rho}$ and $\eta^{\rho\nu}$ as in eq. (7.66), is also an invariant tensor. It is important,
however, to understand that the identity matrix is an invariant tensor
only if we define it with an upper and a lower index. In this way it
transforms so that, if it is equal to diag(1, 1, 1, 1) in a frame, it remains
equal to diag(1, 1, 1, 1) in any other frame related by a Lorentz
transformation. We have derived this from the fact that it is constructed with
$\eta_{\mu\rho}$ and $\eta^{\rho\nu}$, and we proved that the latter are invariant tensors, but we
can also show it directly from its transformation property

$$ \delta^\mu_\nu \to \Lambda^\mu{}_\rho \Lambda_\nu{}^\sigma \delta^\rho_\sigma , \qquad (7.81) $$

and using eq. (7.60) (with the indices properly raised and lowered).
Note that the positioning of the Lorentz indices in the two $\Lambda$ factors
in eq. (7.81) is the one appropriate to the transformation of a tensor
with one upper and one lower index. If, in contrast, we consider a
tensor with two lower indices $T_{\mu\nu}$, whose numerical value in a reference
frame happens to be diag(1, 1, 1, 1), and we perform a Lorentz
transformation transforming it as in (7.73), as appropriate for a tensor with
two lower indices, in the new frame $T'_{\mu\nu}$ will have different numerical
values, and will no longer be of the form diag(1, 1, 1, 1). So, for instance,
it makes no sense to define an identity matrix with two lower indices,
“$\delta_{\mu\nu} = {\rm diag}(1, 1, 1, 1)$.” Such an object is not an invariant tensor, and
the numerical assignment diag(1, 1, 1, 1) could only hold in one Lorentz
frame (and in those related to it by a spatial rotation) but would change
as soon as we perform a boost. Notice that, lowering the upper index of
$\delta^\rho_\nu$, we get $\eta_{\mu\nu}$,

$$ \eta_{\mu\rho} \delta^\rho_\nu = \eta_{\mu\nu} , \qquad (7.82) $$

and similarly raising an index of $\eta_{\mu\nu}$ we get $\delta^\mu_\nu$,

$$ \eta^{\mu\rho} \eta_{\rho\nu} = \delta^\mu_\nu . \qquad (7.83) $$

Just as there is no meaning in writing “$\delta_{\mu\nu} = {\rm diag}(1, 1, 1, 1)$,” there is
no meaning in writing “$\eta^\mu{}_\nu = {\rm diag}(-1, 1, 1, 1)$.” Again, a tensor of the
form diag(−1, 1, 1, 1) maintains the same numerical value in any frame
only if it is transformed according to the transformation law of a tensor
with two lower indices (or with two upper indices), not with one upper
and one lower index.
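The statements about which numerical matrices are invariant under which transformation law can be verified directly for a boost. In this sketch the two transformation laws (7.73) and (7.75) are written as matrix products; the helper names and the choice β = 0.6 are illustrative.

```python
def diag(entries):
    return [[entries[i] if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

ETA = diag([-1.0, 1.0, 1.0, 1.0])
ONE = diag([1.0, 1.0, 1.0, 1.0])

g, beta = 1.25, 0.6                       # boost along x with gamma = 1.25
L_up = [[g, g * beta, 0.0, 0.0], [g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
L_down = matmul(ETA, matmul(L_up, ETA))   # Lambda_mu^nu, eq. (7.71)

def transform_lower_lower(T):
    """T'_{mu nu} = Lambda_mu^rho Lambda_nu^sigma T_{rho sigma}, eq. (7.73)."""
    return matmul(L_down, matmul(T, transpose(L_down)))

def transform_upper_lower(T):
    """T'^mu_nu = Lambda^mu_rho Lambda_nu^sigma T^rho_sigma, eq. (7.75)."""
    return matmul(L_up, matmul(T, transpose(L_down)))

invariance = {
    "eta with two lower indices": close(transform_lower_lower(ETA), ETA),
    "identity with two lower indices": close(transform_lower_lower(ONE), ONE),
    "identity with mixed indices": close(transform_upper_lower(ONE), ONE),
    "eta with mixed indices": close(transform_upper_lower(ETA), ETA),
}
```

Only the first and third entries come out true, in agreement with the discussion above.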

The only other invariant tensor of the Lorentz group (apart from all
possible lowering of its indices, see below) is the totally antisymmetric
tensor $\epsilon^{\mu\nu\rho\sigma}$. This tensor vanishes if two indices take the same value,
satisfies $\epsilon^{0123} = +1$, and changes sign under permutations of any two
indices; so, for instance, repeatedly switching the position of the 0 index,
$\epsilon^{1023} = -1$, $\epsilon^{1203} = +1$, and $\epsilon^{1230} = -1$ so, in this case, it changes
sign under a cyclic permutation. Note, however, that starting from
$\epsilon^{1230} = -1$ and making three jumps for the index 1, we get $\epsilon^{2301} = +1$,
so in this case a cyclic permutation of 0123 gives again +1 instead of −1.
Therefore, the tensor $\epsilon^{\mu\nu\rho\sigma}$ is neither cyclic nor anti-cyclic (in contrast,
for the rotation group in three dimensions, $\epsilon^{ijk}$ is cyclic, since it is again
antisymmetric, but it has only three indices, so, e.g., $\epsilon^{123} = -\epsilon^{213} =
+\epsilon^{231}$). Observe that $\epsilon^{0ijk} = \epsilon^{ijk}$.
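The sign pattern of the four-index antisymmetric tensor can be generated from the parity of permutations. The sketch below also evaluates the contraction of four Λ factors with the tensor, i.e., the left-hand side of eq. (7.84), for a boost; the function names are illustrative.

```python
from itertools import permutations

def permutation_sign(perm):
    """Sign of a permutation of (0, 1, 2, 3), obtained by counting inversions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

# epsilon^{mu nu rho sigma} with epsilon^{0123} = +1; index sets not listed are zero.
EPS = {p: permutation_sign(p) for p in permutations(range(4))}

def eps(m, n, r, s):
    return EPS.get((m, n, r, s), 0)

def transformed_eps(L, m, n, r, s):
    """Lambda^m_a Lambda^n_b Lambda^r_c Lambda^s_d epsilon^{abcd}."""
    return sum(sign * L[m][p[0]] * L[n][p[1]] * L[r][p[2]] * L[s][p[3]]
               for p, sign in EPS.items())

g, beta = 1.25, 0.6                       # illustrative boost along x
boost = [[g, g * beta, 0.0, 0.0], [g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

signs = [eps(0, 1, 2, 3), eps(1, 0, 2, 3), eps(1, 2, 0, 3), eps(1, 2, 3, 0), eps(2, 3, 0, 1)]
```

For a proper Lorentz transformation `transformed_eps` returns the same values as `eps`, since the contraction equals the determinant of Λ times the original tensor.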

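The fact that $\epsilon^{\mu\nu\rho\sigma}$ is an invariant tensor follows from the fact that,
from the definition of the determinant of a 4 × 4 matrix,

$$ \Lambda^\mu{}_{\mu'} \Lambda^\nu{}_{\nu'} \Lambda^\rho{}_{\rho'} \Lambda^\sigma{}_{\sigma'} \epsilon^{\mu'\nu'\rho'\sigma'} = (\det \Lambda) \, \epsilon^{\mu\nu\rho\sigma} , \qquad (7.84) $$

and, for the (proper) Lorentz group, $\det \Lambda = 1$. Combining $\epsilon^{\mu\nu\rho\sigma}$ with
the metric tensor we can lower some of its indices. In particular, $\epsilon_{\mu\nu\rho\sigma}$
is still totally antisymmetric, while mixed combinations such as
$\epsilon_\mu{}^{\nu\rho\sigma} = \eta_{\mu\mu'} \epsilon^{\mu'\nu\rho\sigma}$ are not.

15 Explicitly,

$$ \eta_{\rho\sigma} = \eta_{\mu\nu} (\delta^\mu_\rho + \omega^\mu{}_\rho)(\delta^\nu_\sigma + \omega^\nu{}_\sigma)
= \eta_{\rho\sigma} + \eta_{\mu\nu} \delta^\mu_\rho \omega^\nu{}_\sigma + \eta_{\mu\nu} \delta^\nu_\sigma \omega^\mu{}_\rho + O(\omega^2) . $$

Therefore, to linear order in $\omega$,

$$ 0 = \eta_{\mu\nu} \delta^\mu_\rho \omega^\nu{}_\sigma + \eta_{\mu\nu} \delta^\nu_\sigma \omega^\mu{}_\rho
= \eta_{\rho\nu} \omega^\nu{}_\sigma + \eta_{\mu\sigma} \omega^\mu{}_\rho
= \omega_{\rho\sigma} + \omega_{\sigma\rho} . $$

Observe that this generalizes to the Lorentz group the result that we found
for spatial rotations in eq. (1.151).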


7.3.4 Infinitesimal Lorentz transformations 


With the formalism that we have developed, we can now compute how
many independent parameters there are in a Lorentz transformation.
This is conveniently done restricting to infinitesimal transformations.
For the Lorentz group, the identity transformation is given by $\Lambda^\mu{}_\nu = \delta^\mu_\nu$,
since, on any vector $V^\mu$, we have in this case $V^\mu \to \Lambda^\mu{}_\nu V^\nu = \delta^\mu_\nu V^\nu =
V^\mu$. A transformation infinitesimally close to the identity can then be
written as

$$ \Lambda^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu , \qquad (7.85) $$

where $\omega^\mu{}_\nu$ are infinitesimal of first order, that describe the deviation
of $\Lambda^\mu{}_\nu$ from the identity transformation. Note the positioning of the
indices in $\Lambda^\mu{}_\nu$ and in $\omega^\mu{}_\nu$, with the lower index in the second position;
as we will see in a moment, it is important to keep track of it, since
it will turn out that the matrix $\omega_{\mu\nu}$ is not symmetric. Plugging this
into eq. (7.60), neglecting terms quadratic in the infinitesimal quantity
$\omega^\mu{}_\nu$, and raising and lowering the Lorentz indices according to the rules
discussed in this section, we get¹⁵

$$ \omega_{\mu\nu} = -\omega_{\nu\mu} . \qquad (7.86) $$
An antisymmetric 4 x 4 matrix has six independent elements, so the 
Lorentz group has six parameters. These are the three angles and the 
three rapidities, corresponding to the three independent rotations and 
the three independent boosts that we found by inspection in Section 7.2. 
Thus, the angles and rapidities associated with rotations and boosts, 
respectively, exhaust the parameters associated with Lorentz transfor- 
mations. 
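The parameter counting can be illustrated by building ω from six numbers and checking the defining condition. In the sketch below, the assignment of the six entries of the antisymmetric matrix to "boost" and "rotation" parameters, and their signs, are assumptions made for illustration; the finite transformation is obtained from a truncated exponential series.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta^{mu nu}

def omega_mixed(boost_params, rotation_params):
    """omega^mu_nu obtained from an antisymmetric omega_{mu nu} with 3 + 3 = 6 entries."""
    w = [[0.0] * 4 for _ in range(4)]            # omega_{mu nu}, both indices lower
    for i, value in enumerate(boost_params):     # omega_{0i} = -omega_{i0}
        w[0][i + 1], w[i + 1][0] = value, -value
    a, b, c = rotation_params                    # omega_{12}, omega_{13}, omega_{23}
    w[1][2], w[2][1] = a, -a
    w[1][3], w[3][1] = b, -b
    w[2][3], w[3][2] = c, -c
    return [[ETA[m] * w[m][n] for n in range(4)] for m in range(4)]  # raise first index

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def exp_series(W, terms=30):
    """Truncated series sum_k W^k / k!."""
    result, power = identity(), identity()
    for k in range(1, terms):
        power = [[x / k for x in row] for row in matmul(power, W)]
        result = [[r + p for r, p in zip(rr, pr)] for rr, pr in zip(result, power)]
    return result

def defect(L):
    """Largest entry of |Lambda^T eta Lambda - eta|."""
    return max(abs(sum(L[k][i] * ETA[k] * L[k][j] for k in range(4))
                   - (ETA[i] if i == j else 0.0))
               for i in range(4) for j in range(4))

small = omega_mixed([1e-4, 2e-4, -1e-4], [3e-4, -2e-4, 1e-4])
first_order = [[identity()[m][n] + small[m][n] for n in range(4)] for m in range(4)]
finite = exp_series(omega_mixed([0.3, -0.2, 0.1], [0.5, 0.1, -0.4]))
```

The first-order matrix violates eq. (7.60) only at second order in ω, while the exponentiated matrix satisfies it to rounding error. Note also that the mixed-index matrix returned by `omega_mixed` is symmetric in its time-space entries and antisymmetric in its space-space entries.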


7.3.5 Decomposition of a Lorentz tensor under 
rotations 


Since four-vectors and four-tensors have well defined transformation
properties under the Lorentz group, they also have well defined
transformation properties under spatial rotations, which are a subgroup
of the Lorentz group. In this subsection we explore this connection in
more detail.

Rotations are a particular case of Lorentz transformations, with $\Lambda^\mu{}_\nu$
of the form (7.18) so, in components,

$$ \Lambda^0{}_0 = 1 , \qquad \Lambda^0{}_i = 0 , \qquad \Lambda^i{}_0 = 0 , \qquad \Lambda^i{}_j = R^i{}_j , \qquad (7.87) $$

where $R^i{}_j$ is the rotation matrix. Note that we now keep one index
upper and one lower also on $R^i{}_j$. However, for the rotation group the
spatial indices could be raised and lowered with the Kronecker delta,
that can be written as $\delta^{ij}$ or as $\delta_{ij}$, and we could keep all indices lower
(or upper), as indeed we have done in Section 1.6.

Consider first the Lorentz transformation of a four-vector $V^\mu$, given
by eq. (7.44). Since the matrix $\Lambda$ in eq. (7.18) is in a block-diagonal
form, under rotations $V^0$ does not mix with $V^i$, and

$$ V^0 \to V^0 , \qquad V^i \to R^i{}_j V^j . \qquad (7.88) $$


Thus, under rotations, $V^0$ is a scalar while $V^i$ is a vector. According to
the discussion in Section 1.7.1, the fact that $V^0$ and $V^i$ never mix under
rotations is expressed in the language of group theory by saying that
$V^\mu$ provides a reducible representation of the rotation group; in other
words, it is made of separate “building blocks” (here $V^0$ and $V^i$) that
do not mix among themselves under any rotation. However, under boosts $V^0$
and $V^i$ mix, so four-vectors are an irreducible representation of the full
Lorentz group.

Let us now consider the transformation of a tensor $T^{\mu\nu}$ under
rotations. From eqs. (7.53) and (7.87), we find that

$$ T^{00} \to \Lambda^0{}_0 \Lambda^0{}_0 T^{00} = T^{00} , \qquad (7.89) $$

since, for rotations, $\Lambda^0{}_i = 0$ and $\Lambda^0{}_0 = 1$. This means that $T^{00}$ is a
scalar under rotations. Similarly,

$$ T^{0i} \to \Lambda^0{}_0 \Lambda^i{}_j T^{0j} = R^i{}_j T^{0j} , \qquad (7.90) $$

which is the transformation law of a spatial vector (and the same for
$T^{i0}$), while

$$ T^{ij} \to \Lambda^i{}_k \Lambda^j{}_l T^{kl} = R^i{}_k R^j{}_l T^{kl} , \qquad (7.91) $$

and therefore is a spatial tensor. Recalling that a spatial tensor $T^{ij}$
further decomposes into irreducible representations of the rotation group
as in eq. (1.148) we see that, from the point of view of spatial rotations,
the 16 components of a Lorentz tensor $T^{\mu\nu}$ decompose into two scalars
[$T^{00}$ and $S = \delta_{ij} T^{ij}$, see eq. (1.148)], three vectors ($T^{0i}$, $T^{i0}$ and $A^i$),
and a traceless symmetric tensor $S^{ij}$. The counting of degrees of freedom
of course matches, 4 × 4 = 1 + 1 + 3 + 3 + 3 + 5.

Observe also that the trace of $T^{\mu\nu}$ in the four-dimensional sense,
$T = \eta_{\mu\nu} T^{\mu\nu}$, is a Lorentz invariant quantity, and therefore is invariant (i.e.,
a scalar) also under rotations. Writing $T = \eta_{00} T^{00} + \delta_{ij} T^{ij}$, we see
that $T$ is related to the two scalars under rotations that we have found
above, $T^{00}$ and $S$, by $T = -T^{00} + S$. Note that $T^{00}$ and $S$ are scalars
under rotations but are not Lorentz scalars. For instance, $T^{00}$ is the
(00) component of a Lorentz tensor. Only their combination $-T^{00} + S$
is a Lorentz scalar.
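The decomposition can be checked on a sample tensor. The sketch below assumes that eq. (1.148), which is not reproduced in this section, is the usual split of a spatial tensor into its trace, its antisymmetric part (packed into a three-vector $A^i$), and its traceless symmetric part; the sample components and helper names are illustrative.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def transform(L, T):
    """T'^{mu nu} = Lambda^mu_rho Lambda^nu_sigma T^{rho sigma}, eq. (7.53)."""
    return matmul(L, matmul(T, transpose(L)))

def decompose(T):
    """Split the 16 components of T^{mu nu} into pieces with definite rotation properties."""
    sp = [row[1:] for row in T[1:]]                       # spatial block T^{ij}
    S = sum(sp[i][i] for i in range(3))
    A = [0.5 * (sp[1][2] - sp[2][1]), 0.5 * (sp[2][0] - sp[0][2]), 0.5 * (sp[0][1] - sp[1][0])]
    Sij = [[0.5 * (sp[i][j] + sp[j][i]) - (S / 3.0 if i == j else 0.0) for j in range(3)]
           for i in range(3)]
    return {"T00": T[0][0], "S": S, "T0i": T[0][1:], "Ti0": [T[i][0] for i in (1, 2, 3)],
            "A": A, "Sij": Sij}

def lorentz_trace(T):
    """T = eta_{mu nu} T^{mu nu} = -T^{00} + S."""
    return -T[0][0] + T[1][1] + T[2][2] + T[3][3]

T = [[1.0, 2.0, 0.0, -1.0],
     [0.5, 3.0, 1.0, 0.0],
     [2.0, -1.0, 0.5, 4.0],
     [1.0, 0.0, 2.0, -2.0]]

c, s = math.cos(0.7), math.sin(0.7)
rotation = [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0], [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]
g, beta = 1.25, 0.6
boost = [[g, g * beta, 0.0, 0.0], [g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

before = decompose(T)
rotated = decompose(transform(rotation, T))
boosted = decompose(transform(boost, T))
```

Under the rotation both `T00` and `S` are unchanged; under the boost each of them changes, while `lorentz_trace` keeps its value.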


7.3.6 Covariant transformations of fields 


In classical electrodynamics the fundamental variables are fields, i.e.,
dynamical quantities that depend not only on time, as the typical
variables $q_i(t)$ of an elementary mechanical system, but also on space. For
instance, the scalar potential $\phi(t, \mathbf{x})$, the vector potential $\mathbf{A}(t, \mathbf{x})$, or the
electric and magnetic fields $\mathbf{E}(t, \mathbf{x})$ and $\mathbf{B}(t, \mathbf{x})$, are all functions of time
and space. Under a rotation, or under a Lorentz transformation, they
therefore transform both because their arguments transform, and
because of their intrinsic scalar, vector, or tensor nature. To understand
these transformation properties, let us consider first spatial rotations (in
which case, we can suppress for simplicity the time dependence). The
simplest example of a transformation is that of a scalar field. Consider
for instance the temperature $T(\mathbf{x})$ as a function of the position. The
numerical values of the coordinates $x_i$ of a point P depend on how we
have chosen the reference frame. If we rotate our reference frame, they
will change according to $x_i \to x'_i = R_{ij} x_j$ (for rotations we use here
the simpler convention of keeping all spatial indices lower and summing
over repeated lower indices). However, the temperature at the point P
is the same, independently of how we choose to orient the axes of the
reference frame, i.e., independently of the labels $x_i$ that we choose to
assign to the point P. This means that, when $\mathbf{x} \to \mathbf{x}'$, the function $T(\mathbf{x})$
must change as

$$ T(\mathbf{x}) \to T'(\mathbf{x}') = T(\mathbf{x}) . \qquad (7.92) $$

16 This is completely analogous to the discussion of scalars under
translations, see eq. (6.22) and Note 3 on page 137.

17 As a check, consider a vector field $\mathbf{E}(\mathbf{x}) = \mathbf{x}$. This is a purely radial
vector field. Then eq. (7.95) gives $\mathbf{E}'(\mathbf{x}) = R(R^{-1}\mathbf{x}) = \mathbf{x}$. Indeed, a purely
radial field is rotationally invariant and does not change under rotations.
The same holds if we rather write $\mathbf{E}(\mathbf{x}) = c\mathbf{x}$, with $c$ a constant (needed to
provide the correct dimensions to $\mathbf{E}$), since $R(c\mathbf{x}) = cR\mathbf{x}$, or even if we
take $c = c(r)$ where $r = |\mathbf{x}|$.


This relation expresses the fact that $T$ will become a new function $T'$ of
the new coordinate $\mathbf{x}'$, and the functional form of $T'$ must be such that,
on the new label $\mathbf{x}'$ that we have given to the point P, it has the same
numerical value that the old function $T$ had on the old label $\mathbf{x}$ of P. In
other words, the functional form will adapt itself to the change of the
argument, so that, in the end, the temperature at a point P is the same,
independently of how we have chosen to orient the axes of our reference
frame.¹⁶ Equation (7.92) can be rewritten as $T'(R\mathbf{x}) = T(\mathbf{x})$, where $R\mathbf{x}$
denotes the vector with components $R_{ij} x_j$; then, replacing the generic
point $\mathbf{x}$ by $R^{-1}\mathbf{x}$, we can also rewrite it as

$$ T'(\mathbf{x}) = T(R^{-1}\mathbf{x}) . \qquad (7.93) $$
This defines the transformation of a scalar field (in this case, scalar under
spatial rotations). If, in contrast, we consider a vector field, such as for
instance the electric field $\mathbf{E}(\mathbf{x})$, when the label $\mathbf{x}$ of the point P becomes
$\mathbf{x}'$, with $x'_i = R_{ij} x_j$, the vector itself (seen as an abstract geometric
object, e.g., an arrow starting from P with a given length and direction)
will not change, but now we must refer its components $E_i$ to the new
axes. Thus, under a rotation $x_i \to R_{ij} x_j$, they will change as

$$ E_i(\mathbf{x}) \to E'_i(\mathbf{x}') = R_{ij} E_j(\mathbf{x}) . \qquad (7.94) $$

This can be rewritten also as $\mathbf{E}'(R\mathbf{x}) = R\mathbf{E}(\mathbf{x})$, or

$$ \mathbf{E}'(\mathbf{x}) = R\mathbf{E}(R^{-1}\mathbf{x}) . \qquad (7.95) $$

This is the transformation law of a vector field.¹⁷ Similarly, a tensor field
transforms as

$$ T_{ij}(\mathbf{x}) \to T'_{ij}(\mathbf{x}') = R_{ik} R_{jl} T_{kl}(\mathbf{x}) . \qquad (7.96) $$
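The transformation laws (7.93) and (7.95) can be turned into small higher-order functions that map a field to the transformed field. This is a sketch for rotations around the z axis only; the field definitions and function names are illustrative assumptions.

```python
import math

def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(R, x):
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def transpose(R):
    return [list(col) for col in zip(*R)]

def transform_scalar(field, R):
    """T'(x) = T(R^{-1} x), eq. (7.93); for rotations R^{-1} is the transpose."""
    R_inv = transpose(R)
    return lambda x: field(apply(R_inv, x))

def transform_vector(field, R):
    """E'(x) = R E(R^{-1} x), eq. (7.95)."""
    R_inv = transpose(R)
    return lambda x: apply(R, field(apply(R_inv, x)))

def temperature(x):          # hypothetical scalar field
    return x[0] + 2.0 * x[1] ** 2

def radial(x):               # purely radial vector field E(x) = x
    return list(x)

def uniform(x):              # uniform field along the x axis
    return [1.0, 0.0, 0.0]

R = rot_z(0.7)
x = [1.0, 2.0, 3.0]
x_new = apply(R, x)          # new label of the same point P

same_temperature = transform_scalar(temperature, R)(x_new)
radial_new = transform_vector(radial, R)(x)
uniform_new = transform_vector(uniform, R)(x)
```

The transformed scalar evaluated at the new label reproduces the old value at the old label, the radial field is unchanged as a function, and the uniform field is rotated into the direction (cos θ, sin θ, 0).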


We now consider a scalar function $f(\mathbf{x})$, and we study how its gradient
transforms under rotations. If $x_i \to x'_i = R_{ij} x_j$ and $f(\mathbf{x}) \to f'(\mathbf{x}') =
f(\mathbf{x})$, we have

$$ \frac{\partial f(\mathbf{x})}{\partial x_i} \to \frac{\partial f'(\mathbf{x}')}{\partial x'_i} = \frac{\partial x_j}{\partial x'_i} \frac{\partial f(\mathbf{x})}{\partial x_j} . \qquad (7.97) $$

For orthogonal matrices, the inversion of $x'_i = R_{ij} x_j$ gives $x_j = R_{kj} x'_k$
and therefore $\partial x_j / \partial x'_i = R_{ij}$,¹⁸ so we finally obtain

$$ \frac{\partial f(\mathbf{x})}{\partial x_i} \to R_{ij} \frac{\partial f(\mathbf{x})}{\partial x_j} . \qquad (7.98) $$

Comparing this to eq. (7.94) we see that the gradient of a scalar function
transforms as a vector field. A useful notation that makes this result
more explicit is

$$ \partial_i \equiv \frac{\partial}{\partial x_i} , \qquad (7.99) $$

so eq. (7.98) reads

$$ \partial_i f(\mathbf{x}) \to R_{ij} \partial_j f(\mathbf{x}) . \qquad (7.100) $$


Thus, the index $i$ in $\partial_i$ behaves as a vector index. In the same way we
can prove, for instance, that $\partial_i\partial_j f(\mathbf{x})$ is a tensor field with two indices
(symmetric, since the derivatives commute, assuming as always smooth
functions) or that, if $v_i(\mathbf{x})$ is a vector field under rotations, then $\partial_i v_j(\mathbf{x})$
is also a tensor field under rotations, and so on.
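The statement that the gradient of a scalar transforms as a vector, eqs. (7.93), (7.98) and (7.100), can be checked numerically. The following Python sketch is only an illustration: the test function, the rotation angle, the sample point and all helper names are arbitrary choices made here, and the derivatives are approximated by central finite differences, so agreement holds only up to discretization error.

```python
import math

def rotation_z(angle):
    """3x3 rotation matrix R_ij for a rotation by `angle` around the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(R, x):
    """Components (R x)_i = R_ij x_j."""
    return [sum(R[i][j] * x[j] for j in range(3)) for i in range(3)]

def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]

def gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

def f(x):
    # Arbitrary smooth scalar function, chosen only for this illustration
    return x[0] ** 2 * x[1] + math.sin(x[2]) + x[0] * x[2]

R = rotation_z(0.7)
R_inv = transpose(R)  # for orthogonal matrices R^{-1} = R^T

def f_prime(x):
    # Transformed scalar field, eq. (7.93): f'(x) = f(R^{-1} x)
    return f(matvec(R_inv, x))

x = [0.3, -1.2, 0.8]
x_prime = matvec(R, x)

lhs = gradient(f_prime, x_prime)   # gradient of f' evaluated at x'
rhs = matvec(R, gradient(f, x))    # R_ij times the gradient of f at x
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_diff)  # small, limited by the finite-difference step
```

The same comparison could be repeated with any other orthogonal matrix in place of `rotation_z`.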

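The generalization of these manipulations to the Lorentz group is
straightforward. A field $\phi(x)$ [where we use the notation $x$ to denote
collectively $(t,\mathbf{x})$] is a scalar under Lorentz transformations if, under
$x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$, it transforms as

\[ \phi(x) \to \phi'(x') = \phi(x) . \tag{7.101} \]

Similarly, a (contravariant) four-vector field $V^\mu(x)$ is defined as a field
that transforms as

\[ V^\mu(x) \to V'^\mu(x') = \Lambda^\mu{}_\nu V^\nu(x) , \tag{7.102} \]

while a covariant vector field transforms as

\[ V_\mu(x) \to V'_\mu(x') = \Lambda_\mu{}^\nu V_\nu(x) , \tag{7.103} \]

and similarly for tensor fields. We define

\[ \partial_\mu = \frac{\partial}{\partial x^\mu} . \tag{7.104} \]

Using eq. (7.60), we can check that the inversion of $x'^\mu = \Lambda^\mu{}_\nu x^\nu$ is

\[ x^\mu = \Lambda_\nu{}^\mu x'^\nu . \tag{7.105} \]

Then, if $\phi(x)$ is a Lorentz scalar,

\[ \frac{\partial\phi(x)}{\partial x^\mu} \to \frac{\partial\phi'(x')}{\partial x'^\mu} = \frac{\partial x^\nu}{\partial x'^\mu}\,\frac{\partial\phi(x)}{\partial x^\nu} = \Lambda_\mu{}^\nu\,\frac{\partial\phi(x)}{\partial x^\nu} , \tag{7.106} \]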


18 This is seen most easily writing, in
matrix form, $\mathbf{x}' = R\mathbf{x}$. The inversion
is then $\mathbf{x} = R^{-1}\mathbf{x}'$. For orthogonal
matrices $R^{-1} = R^T$, see eq. (1.128),
so we get $\mathbf{x} = R^T\mathbf{x}'$. In components,
this gives $x_j = R^T_{ji}x'_i$ and, by definition
of transpose matrix, $R^T_{ji} = R_{ij}$.
Otherwise, working in components, we
multiply $x'_i = R_{ij}x_j$ by $R_{ik}$, to get
$R_{ik}x'_i = R_{ik}R_{ij}x_j$ and use eq. (1.129)
in the form $R_{ik}R_{ij} = \delta_{kj}$. This gives
$R_{ik}x'_i = x_k$ and we then rename $k \to j$.


19 For spatial indices, the upper/lower 
position is irrelevant. However, we will 
see in eq. (7.107) that, for Lorentz in- 
dices, the derivative with respect to a 
quantity with upper index gives a quan- 
tity with lower index, so in the final re- 
sult (7.99) we already use the position- 
ing of the indices appropriate for the 
extension to the relativistic context. 
20 This can also be checked observing,
for instance, that if we act with $\partial_\mu$ on
the scalar field configuration $\phi(x) = v_\nu x^\nu$,
with $v_\nu$ some four-vector independent
of $x$, we get

\[ \partial_\mu(v_\nu x^\nu) = v_\nu\frac{\partial x^\nu}{\partial x^\mu} = v_\nu\delta^\nu_\mu = v_\mu , \tag{7.108} \]

so the result is indeed a four-vector with
lower index. The fact that $\partial x^\nu/\partial x^\mu = \delta^\nu_\mu$
follows from the fact that this
derivative is one if $\nu = \mu$ and is zero
otherwise, and it is a tensor, with an
upper index $\nu$ inherited from $x^\nu$. It is
therefore given by $\delta^\nu_\mu$.


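so

\[ \partial_\mu\phi(x) \to \Lambda_\mu{}^\nu\,\partial_\nu\phi(x) , \tag{7.107} \]

which shows that $\partial_\mu\phi(x)$ is a covariant four-vector field. Note that $\partial_\mu$,
which is the derivative with respect to $x^\mu$ where $\mu$ is an upper index,
produces a four-vector with lower index.²⁰ We can also define

\[ \partial^\mu = \frac{\partial}{\partial x_\mu} , \tag{7.109} \]

and, from $x_\mu = \eta_{\mu\nu}x^\nu$, we can easily prove that $\partial^\mu = \eta^{\mu\nu}\partial_\nu$ and $\partial_\mu = \eta_{\mu\nu}\partial^\nu$,
so the index $\mu$ in $\partial_\mu$ or $\partial^\mu$ can be treated as a normal Lorentz
index, raised and lowered with $\eta_{\mu\nu}$ and $\eta^{\mu\nu}$. Note that

\[ \partial_\mu x^\nu = \delta_\mu^\nu , \tag{7.110} \]

while

\[ \partial^\mu x^\nu = \eta^{\mu\nu} , \tag{7.111} \]

as can be seen by contracting both sides of $\partial_\mu x^\nu = \delta_\mu^\nu$ with $\eta^{\rho\mu}$.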

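We can similarly work out the transformations of other quantities
involving $\partial_\mu$. Consider for instance a tensor field $T^{\mu\nu}(x)$ and form the
quantity $\partial_\mu T^{\mu\nu}(x)$. Under Lorentz transformations

\[ \frac{\partial T^{\mu\nu}(x)}{\partial x^\mu} \to \frac{\partial T'^{\mu\nu}(x')}{\partial x'^\mu} = \frac{\partial x^\alpha}{\partial x'^\mu}\,\frac{\partial}{\partial x^\alpha}\left[\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma T^{\rho\sigma}(x)\right] = \Lambda_\mu{}^\alpha\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma\,\partial_\alpha T^{\rho\sigma}(x) , \tag{7.112} \]

where we used eq. (7.105) to compute $\partial x^\alpha/\partial x'^\mu$. Using eq. (7.60), we
get

\[ \Lambda_\mu{}^\alpha\Lambda^\mu{}_\rho = \delta^\alpha_\rho , \tag{7.113} \]

and therefore

\[ \partial_\mu T^{\mu\nu}(x) \to \Lambda^\nu{}_\sigma\,\partial_\rho T^{\rho\sigma}(x) . \tag{7.114} \]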


In terms of $T^\nu(x) = \partial_\mu T^{\mu\nu}(x)$, this reads $T^\nu(x) \to \Lambda^\nu{}_\sigma T^\sigma(x)$, which is
the transformation law of a four-vector field. Therefore $\partial_\mu T^{\mu\nu}(x)$ is a
four-vector field. From these examples, it is clear that the transformation
properties of quantities obtained acting with $\partial_\mu$ on tensor fields can
be read from the remaining free Lorentz indices. So, as we have seen
explicitly, if $\phi(x)$ is a scalar field, $\partial_\mu\phi$ is a four-vector field; similarly,
we can show that $\partial_\mu\partial_\nu\phi$ is a Lorentz tensor with two covariant indices,
etc. If $V^\mu(x)$ is a four-vector field, manipulations analogous to those
performed above show that $\partial_\mu V^\mu(x)$ is a scalar field, while, for instance,
$\partial_\mu V_\nu(x)$ is a Lorentz tensor field with two covariant indices; given a
Lorentz tensor field $T^{\mu\nu}(x)$, $\partial_\mu T^{\mu\nu}(x)$ is a four-vector field, as we have
seen explicitly, and similarly $\partial_\mu\partial_\nu T^{\mu\nu}(x)$ is a scalar field, $\partial_\rho T^{\mu\nu}(x)$ is
a Lorentz tensor field with one covariant and two contravariant indices,
and so on.
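The index manipulations above rest on the identity (7.113). As a hedged numerical illustration, the Python sketch below builds a boost along the $x$ axis, forms $\Lambda_\mu{}^\nu$ by lowering and raising indices with the metric (assumed here to be the content of eq. (7.60)), and checks both the identity and the invariance of a contraction $W_\mu V^\mu$. The sign convention of the boost, the value $\beta = 0.6$, and the sample components are arbitrary choices made for this example.

```python
import math

# Minkowski metric with signature (-,+,+,+); numerically eta_{mu nu} = eta^{mu nu}
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def boost_x(beta):
    """Lambda^mu_nu for a boost along x with velocity beta*c (sign convention assumed)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, gamma * beta, 0.0, 0.0],
            [gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def lower_raise(lam):
    """Lambda_mu^nu = eta_{mu a} Lambda^a_b eta^{b nu}."""
    return [[sum(ETA[m][a] * lam[a][b] * ETA[b][n]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

lam = boost_x(0.6)
lam_cov = lower_raise(lam)

# Contraction Lambda_mu^alpha Lambda^mu_rho, expected to equal delta^alpha_rho
delta = [[sum(lam_cov[m][a] * lam[m][r] for m in range(4))
          for r in range(4)] for a in range(4)]

# A contravariant vector V^mu and a covariant vector W_mu (arbitrary sample values)
V = [2.0, 0.3, -1.0, 0.5]
W = [0.7, 1.1, 0.4, -0.2]
V_new = [sum(lam[m][n] * V[n] for n in range(4)) for m in range(4)]      # eq. (7.102)
W_new = [sum(lam_cov[m][n] * W[n] for n in range(4)) for m in range(4)]  # eq. (7.103)

before = sum(W[m] * V[m] for m in range(4))
after = sum(W_new[m] * V_new[m] for m in range(4))
print(before, after)  # the two contractions agree up to rounding
```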




7.3.7 More general lessons 


We conclude this section by remarking that the two postulates of Special 
Relativity emerged from the extraordinary physical intuition of Einstein. 
From the modern point of view, largely stimulated by Special Relativ- 
ity, as well as by the developments in the theory of fundamental inter- 
actions and quantum field theory, one of the fundamental questions is 
always what are the symmetries of a given theory. Special Relativity is 
a statement about the symmetries of Nature at the most fundamental 
and elementary level, namely, the symmetries of space and time. In 
the mathematical language that we have developed in this section, the 
postulates of Special Relativity are equivalent to saying that, as far as co- 
ordinate transformations are concerned, the symmetry group of Nature 
is given by the Lorentz group SO(3,1), rather than just by its rotation 
subgroup SO(3) [together with Galilean velocity transformations (7.2)]. 
In fact, with one more decade of deep thinking, Einstein went even 
(much) further and realized that, when gravitation enters the game, the 
symmetry transformations are much larger and include all coordinate 
diffeomorphisms. This, however, is the subject of another fascinating 
chapter of physics, General Relativity. Modern particle physics is also 
very largely based on the identification of symmetry groups, in this case 
at the level of the dynamics; this will be the subject of a quantum field 
theory course. For the purpose of the present course on classical elec- 
trodynamics, we can just stress that the revolution initiated by Special 
Relativity has gradually permeated basically all of theoretical physics, 
not only because of its specific concepts, but also for bringing the no- 
tion of symmetry groups to the forefront of the modern understanding 
of Nature. 


7.4 Relativistic particle kinematics 


7.4.1 Covariant description of particle trajectories 


In Newtonian mechanics, the motion of a particle is described by giving
the evolution of the three spatial coordinates as a function of time, $x^i(t)$.
In our relativistic setting, for a given inertial observer K that uses coordinates
$(t,\mathbf{x})$, this amounts to giving the spatial components of $x^\mu$ as
a function of $t$ or, equivalently, of $x^0 = ct$. In Special Relativity this is
not a natural choice, since it obscures the fundamental Lorentz covariance
of the equations, by separating artificially the $x^0$ coordinate from
the three spatial coordinates $x^i$, that together form the four-vector $x^\mu$.
Furthermore, the use of time $t$ as a way to parametrize the trajectory is
also not natural from the relativistic point of view, since time is not a
Lorentz-invariant quantity. However, we have seen in Section 7.2.3 that
for a massive particle, moving at a speed $v$ strictly smaller than $c$, we
can introduce its proper time $\tau$, as in eq. (7.33). Since, for motions with
$v < c$, we have $ds^2 < 0$, $d\tau = \sqrt{-ds^2}/c$ is a real quantity. Since $d\tau$
is defined in terms of the interval $ds^2$, it is clearly a Lorentz-invariant
quantity, and any inertial observer will agree on its value (apart from an
arbitrary choice of the origin of time). For the inertial observer K that
uses coordinates $(t,\mathbf{x})$ and sees the particle moving with velocity $\mathbf{v}(t)$,
the relation between the particle's proper time $\tau$ and his/her time $t$ is
given by eq. (7.35). This relation is one-to-one since, from eq. (7.34),
we see that $d\tau/dt = 1/\gamma > 0$, and can be inverted to obtain $t = t(\tau)$ or,
equivalently,

\[ x^0 = x^0(\tau) . \tag{7.115} \]

Since $t$ can be expressed as a function of $\tau$, instead of using $\mathbf{x}(t)$ the
observer K can use

\[ x^i(\tau) = x^i[t(\tau)] . \tag{7.116} \]


As a result, the trajectory of the particle is now described by the set of
four functions $\{x^0(\tau), x^i(\tau)\}$ or, in four-vector form, $x^\mu(\tau)$. In this way,
the motion of a particle in space-time is described in an explicitly covariant
manner, through a four-vector $x^\mu$, which is a function of a Lorentz-invariant
quantity $\tau$. From the point of view of Lorentz covariance, this
is much more natural than using $x^i(t)$, in which we have a three-vector
$x^i$, function of a quantity $t$, or of $x^0$, which is the temporal component
of a four-vector. In other words, rather than describing the motion
of a particle throughout space with a function $\mathbf{x}(t)$, as in Newtonian
mechanics, we prefer to use a parametric form $\mathbf{x} = \mathbf{x}(\tau)$ and $t = t(\tau)$.
In principle, one could invert the latter to obtain $\tau$ as a function of $t$,
$\tau = \tau(t)$, and plug this back into $\mathbf{x} = \mathbf{x}(\tau)$ to get back the more usual
description in terms of $\mathbf{x}(t)$. However, the description in terms of $x^\mu(\tau)$
has the advantage of being explicitly Lorentz covariant. The functions
$x^\mu(\tau)$ define the so-called particle world-line.

Given a trajectory $x^\mu(\tau)$, in an infinitesimal interval $d\tau$ of proper time
$x^\mu(\tau)$ changes by an amount $dx^\mu(\tau) = (dx^0(\tau), d\mathbf{x}(\tau))$. The interval $ds^2$
separating the events $x^\mu(\tau)$ and $x^\mu(\tau) + dx^\mu(\tau)$ is

\[ ds^2 = -[dx^0(\tau)]^2 + [d\mathbf{x}(\tau)]^2 . \tag{7.117} \]

From the definition (7.33) of proper time, we then have

\[ [dx^0(\tau)]^2 - [d\mathbf{x}(\tau)]^2 = c^2 d\tau^2 , \tag{7.118} \]

or, in a more explicitly covariant form,

\[ \eta_{\mu\nu}\,dx^\mu(\tau)\,dx^\nu(\tau) = -c^2 d\tau^2 . \tag{7.119} \]

From $x^\mu(\tau)$ we can form the four-velocity $u^\mu$, defined as

\[ u^\mu(\tau) = \frac{dx^\mu(\tau)}{d\tau} . \tag{7.120} \]

Since $x^\mu$ is a four-vector, and $\tau$ is Lorentz-invariant, $u^\mu(\tau)$ is a four-vector.
From eq. (7.119) we immediately get

\[ u^2 \equiv \eta_{\mu\nu}u^\mu u^\nu = -c^2 . \tag{7.121} \]

Consider an inertial frame K, with coordinates $(t,\mathbf{x})$, where the particle
moves with velocity $\mathbf{v}(t)$. Using $d\tau = dt/\gamma$ from eq. (7.34), $x^0 = ct$, and
$dx^i/dt = v^i$, we get

\[ u^0 = \gamma\frac{dx^0}{dt} = \gamma c , \tag{7.122} \]
\[ u^i = \gamma\frac{dx^i}{dt} = \gamma v^i , \tag{7.123} \]


from which one can check that eq. (7.121) indeed holds. 
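The check mentioned above can also be done numerically. The following Python sketch, with an arbitrarily chosen velocity and hypothetical helper names, builds $u^\mu = (\gamma c, \gamma\mathbf{v})$ from eqs. (7.122) and (7.123) and evaluates $\eta_{\mu\nu}u^\mu u^\nu/c^2$, which should equal $-1$ according to eq. (7.121).

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(v):
    """Return u^mu = (gamma*c, gamma*v) for a three-velocity v with |v| < c."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C ** 2)
    return [gamma * C] + [gamma * vi for vi in v]

def minkowski_square(u):
    """eta_{mu nu} u^mu u^nu with signature (-,+,+,+)."""
    return -u[0] ** 2 + sum(ui ** 2 for ui in u[1:])

# Arbitrary sample velocity with |v|^2 = 0.5 c^2
u = four_velocity([0.3 * C, -0.4 * C, 0.5 * C])
ratio = minkowski_square(u) / C ** 2
print(ratio)  # close to -1, independently of the velocity chosen
```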


7.4.2 Action of a free relativistic particle 


We now introduce a relativistic generalization of the action principle. 
This will be useful as a first example of a relativistic action principle, and 
also provides a clean conceptual way of defining energy and momentum 
for a free relativistic particle. Recall that, in classical mechanics, for 
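a particle described by coordinates $\mathbf{q}(t)$, with components $q_i(t)$, the
Lagrangian is a function of the coordinates and their time derivatives,
$L[\mathbf{q}, \dot{\mathbf{q}}]$, and the corresponding action is

\[ S = \int dt\, L[\mathbf{q}, \dot{\mathbf{q}}] . \tag{7.124} \]

The conjugate momentum is defined by

\[ p_i = \frac{\partial L}{\partial\dot{q}_i} . \tag{7.125} \]

The Hamiltonian is then defined as

\[ H[\mathbf{q}, \mathbf{p}] = \dot{\mathbf{q}}\cdot\mathbf{p} - L[\mathbf{q}, \dot{\mathbf{q}}] , \tag{7.126} \]

where $\dot{\mathbf{q}}$ is expressed in terms of $\mathbf{p}$ (and possibly of $\mathbf{q}$) by inverting
eq. (7.125). Writing $-H[\mathbf{q},\mathbf{p}] = L[\mathbf{q},\dot{\mathbf{q}}] - \dot{\mathbf{q}}\cdot\mathbf{p}$, and comparing it to
the definition of Legendre transform in eq. (5.108) and the discussion
following it, we see that $-H$ is the Legendre transform of $L$, and $\dot{\mathbf{q}}$ and
$\mathbf{p}$ are conjugate variables, in the sense of the Legendre transform.

The simplest example is provided by a particle of mass $m$ in an external
potential $V(\mathbf{q})$. The Lagrangian is

\[ L = \frac{1}{2}m\dot{\mathbf{q}}^2 - V(\mathbf{q}) , \tag{7.127} \]

where $\dot{\mathbf{q}} = \mathbf{v}$ is the velocity of the particle. The momentum is then
$p_i = m\dot{q}_i$, whose inversion is just $\dot{q}_i = p_i/m$, and we get

\[ H = \frac{\mathbf{p}^2}{2m} + V(\mathbf{q}) . \tag{7.128} \]
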
For a free relativistic particle the action must be Lorentz invariant and 
must therefore be obtained from the integration of a Lorentz-invariant 
first-order differential. The only available Lorentz-invariant quantity 
21 This kind of reasoning, based on the
most general structure that respects
some symmetry principle, is quite typical
of modern field theory. In more
detail, we wish to construct the action
for a free relativistic point-like particle.
The fact that the particle has no inner
structure means that the only variables
that we have at our disposal are its
proper time $\tau$ and the four-vector $x^\mu(\tau)$
that describes its trajectory. Requiring
invariance under spatial and temporal
translations implies that the action
must be invariant under the transformation
$x^\mu(\tau) \to x^\mu(\tau) + a^\mu$, where
$a^\mu$ is an arbitrary constant four-vector.
This means that any dependence of the
action on $x^\mu(\tau)$ can only enter through
$dx^\mu(\tau)/d\tau$, i.e., $u^\mu$. The most general
action could then have the form
$S = \int d\tau\, f(u^\mu)$, for some scalar function
$f$. To construct a scalar out of $u^\mu$
we need to contract its Lorentz index
with another four-vector (or, more generally,
to consider $u^\mu u^\nu$ and contract its
two Lorentz indices with a tensor with
two indices, or to consider $u^\mu u^\nu u^\rho$ and
contract it with a tensor with three indices,
and so on). Since we are considering
a free particle, there is no external
four-vector field (such as the gauge potential
$A^\mu$) that could be used here, and
the contraction of $u^\mu$ with itself gives a
constant, because of eq. (7.121). Therefore,
$f$ must be a constant, independent
of $u^\mu$. Note that this state of affairs
changes completely if we assume that
the particle has an internal structure.
In this case, new degrees of freedom
would enter the action; for instance, a
massive particle could have a spin $\mathbf{s}$,
that in a covariant setting can also be
described by a four-vector $s^\mu$, defined
by the fact that, in the rest frame of the
particle, $s^\mu = (0,\mathbf{s})$. The action then
becomes more complicated, and in general
contains an infinite number of possible
terms that can be organized in order
of importance, in a sense very similar
to the multipole expansion discussed
in Chapter 6. This is the logic behind
the use of effective actions in modern
field theory.


(assuming the particle to be point-like and without internal degrees of
freedom, such as spin) is the proper time of the particle, so we must
have

\[ S_{\rm free} = -\alpha\int d\tau , \tag{7.129} \]

for some constant $\alpha$.²¹ Using $d\tau = dt/\gamma$, we can also rewrite this as

\[ S_{\rm free} = -\alpha\int dt\,\sqrt{1 - \frac{v^2}{c^2}} , \tag{7.130} \]

so the corresponding Lagrangian is

\[ L = -\alpha\sqrt{1 - \frac{v^2}{c^2}} . \tag{7.131} \]

Expanding the square root to order $v^2/c^2$ we get

\[ L = -\alpha + \frac{\alpha}{2}\frac{v^2}{c^2} + \mathcal{O}\!\left(\frac{v^4}{c^4}\right) . \tag{7.132} \]

Comparison with the non-relativistic limit shows that $\alpha = mc^2$, so that,
apart from a constant term that has no effect on the equations of motion,
we get the non-relativistic Lagrangian of a free particle, $L = (1/2)mv^2$.
Therefore, the action of a free relativistic particle is

\[ S_{\rm free} = -mc^2\int d\tau , \tag{7.133} \]

or, equivalently,

\[ S_{\rm free} = -mc^2\int dt\,\sqrt{1 - \frac{v^2}{c^2}} . \tag{7.134} \]


7.4.3 Relativistic energy and momentum


From the relativistic Lagrangian, we can get the relativistic momentum
using eq. (7.125),

\[ p_i = \frac{\partial L}{\partial v^i} = -mc^2\frac{\partial}{\partial v^i}\sqrt{1 - \frac{v^2}{c^2}} = \gamma m v^i , \tag{7.135} \]

and the energy

\[ E = \mathbf{p}\cdot\mathbf{v} - L = \gamma mc^2 . \tag{7.136} \]

In conclusion, we have obtained the relativistic expression for the energy
and momentum of a particle,

\[ E = \gamma mc^2 = \frac{mc^2}{\sqrt{1 - (v^2/c^2)}} , \tag{7.137} \]

and

\[ \mathbf{p} = \gamma m\mathbf{v} = \frac{m\mathbf{v}}{\sqrt{1 - (v^2/c^2)}} . \tag{7.138} \]

Equation (7.137) shows that, even for $v = 0$, a particle still has an energy

\[ E = mc^2 , \tag{7.139} \]

associated with its mass (without doubt, the most famous formula of
physics for the general public!). Expanding in powers of $v^2/c^2$,
the next term gives the Newtonian expression for the kinetic energy,

\[ E = mc^2 + \frac{1}{2}mv^2 + \mathcal{O}\!\left(\frac{mv^4}{c^2}\right) . \tag{7.140} \]

Comparing eqs. (7.137) and (7.138) to eqs. (7.122) and (7.123) we find
that

\[ E/c = mu^0 , \tag{7.141} \]
\[ p^i = mu^i . \tag{7.142} \]

Therefore, defining

\[ p^\mu = mu^\mu , \tag{7.143} \]

we have

\[ p^\mu = (E/c, \mathbf{p}) . \tag{7.144} \]

Since we have already proven that $u^\mu$ is a four-vector, this shows that
$E/c$ and $\mathbf{p}$ form a four-vector. We will refer to $p^\mu$ as the four-momentum
of the particle. Note, from eq. (7.121), that

\[ p^2 \equiv \eta_{\mu\nu}p^\mu p^\nu = -m^2c^2 , \tag{7.145} \]

or, explicitly,²²

\[ E^2 = m^2c^4 + |\mathbf{p}|^2c^2 . \tag{7.146} \]

For physical reasons we only keep the positive root of this equation,²³ so

\[ E = \sqrt{m^2c^4 + |\mathbf{p}|^2c^2} , \tag{7.147} \]

which is the dispersion relation for relativistic particles, that we already
anticipated in eq. (3.50). Equations (7.137) and (7.138) can also be
combined to give

\[ \mathbf{v} = \frac{c^2\mathbf{p}}{E} . \tag{7.148} \]

From eq. (7.147), if $m = 0$ we have

\[ E = |\mathbf{p}|c . \tag{7.149} \]


22 A comment on notation. When energy
appears in equations where also
the electric field appears, as was the
case for instance in eq. (3.54), to avoid
confusion we use $\mathcal{E}$ for the energy and $E$
for the electric field (so $E = |\mathbf{E}|$ denotes
the modulus of the electric field). In a
context such as here, where only energy
is involved, we use the more common
notation $E$ for the energy.


23 A really satisfying understanding of
the negative root only comes from
quantum field theory. See, e.g., Section 4.1
of Maggiore (2005).
24 This expression for the rapidity is
useful in experimental particle physics.
Note that the rapidity, that we denoted
by $\zeta$, is also often denoted by $y$ in this
particle physics context, while the letter
$\eta$ is instead reserved to the 'pseudo-rapidity',
defined as

\[ \eta = \frac{1}{2}\log\left(\frac{|\mathbf{p}| + p_L}{|\mathbf{p}| - p_L}\right) . \]

For an ultra-relativistic particle, $E \simeq |\mathbf{p}|c$,
and the pseudo-rapidity becomes
the same as the rapidity.

A useful property of the rapidity is
that, if we consider a particle characterized
by a value $\zeta$ of the rapidity,
i.e., with $E = mc^2\cosh\zeta$ and $p_L = mc\sinh\zeta$,
and we perform a boost with
rapidity $\zeta_0$ along the longitudinal direction,
using the analogue of eq. (7.19)
we get

\[ E'/c = (E/c)\cosh\zeta_0 + p_L\sinh\zeta_0 = mc\,(\cosh\zeta\cosh\zeta_0 + \sinh\zeta\sinh\zeta_0) = mc\cosh(\zeta + \zeta_0) , \tag{7.155} \]

and, similarly, $p'_L = mc\sinh(\zeta + \zeta_0)$.

In other words, under boosts in the longitudinal
direction, $\zeta$ transforms simply
as $\zeta \to \zeta + \zeta_0$ (which is completely
analogous to the composition of
angles for subsequent rotations around
the same axis). Therefore, $d\zeta$ is invariant
under longitudinal boosts.


Inserting this in eq. (7.148) and taking the modulus, we get |v| = c. 
Massless particles always travel at the speed of light and, vice versa, if 
|v| = c then eqs. (7.147) and (7.148) give m = 0. 
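The relations between energy, momentum and velocity collected in eqs. (7.137), (7.138), (7.147) and (7.148) are easy to verify numerically. The Python sketch below is only an illustration: the mass (1 kg) and the velocity are arbitrary sample values and the function name is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def energy_momentum(m, v):
    """Return (E, p) = (gamma m c^2, gamma m v) for mass m and three-velocity v."""
    v2 = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C ** 2)
    return gamma * m * C ** 2, [gamma * m * vi for vi in v]

m = 1.0                      # sample mass in kg
v = [0.6 * C, 0.0, 0.0]      # sample velocity
E, p = energy_momentum(m, v)

# Dispersion relation E^2 = m^2 c^4 + |p|^2 c^2, as a relative residual
p_abs = math.sqrt(sum(pi * pi for pi in p))
dispersion_residual = (E ** 2 - (m ** 2 * C ** 4 + p_abs ** 2 * C ** 2)) / E ** 2

# Velocity recovered from v = c^2 p / E
v_from_p = [C ** 2 * pi / E for pi in p]
print(dispersion_residual, v_from_p[0] / C)
```

For $|\mathbf{v}| \to c$ at fixed $E$ the mass term becomes negligible and $E \simeq |\mathbf{p}|c$, in line with the massless limit discussed above.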

Consider now a particle with energy $E$, subject to an external interaction
that changes its four-momentum. Applying a time derivative to
both sides of $p_\mu p^\mu = -m^2c^2$ it follows that

\[ 0 = p_\mu\frac{dp^\mu}{dt} = -\frac{E}{c^2}\frac{dE}{dt} + \mathbf{p}\cdot\frac{d\mathbf{p}}{dt} . \tag{7.150} \]

Therefore, using eq. (7.148), we see that, in a full relativistic setting, we
still have

\[ \frac{dE}{dt} = \mathbf{v}\cdot\frac{d\mathbf{p}}{dt} . \tag{7.151} \]

In the non-relativistic limit, $d\mathbf{p}/dt$ is equal to the force $\mathbf{F}$ acting on the
particle, and we recover the result that $dE/dt = \mathbf{v}\cdot\mathbf{F}$, i.e., $dE/dt$ is equal
to the work per unit time performed by the external force.

The fact that $p^\mu$ is a four-vector immediately tells us how energy and
momentum change under Lorentz transformations. They will simply
transform as any other four-vector; so, for instance, if we make a boost
with a velocity $v_0$ along the $x$ axis, we can read the transformation from
eqs. (7.22) and (7.23) with $x^0$ replaced by $E/c$ and $x$ replaced by $p_x$,
which gives

\[ E' = \gamma(v_0)\,(E + v_0 p_x) , \tag{7.152} \]
\[ p'_x = \gamma(v_0)\left(p_x + \frac{v_0}{c^2}E\right) , \tag{7.153} \]

while $p'_y = p_y$ and $p'_z = p_z$. In particular, if before the boost the particle
is at rest, after the boost we will have $E = \gamma(v_0)mc^2$ and $p_L = \gamma(v_0)mv_0$,
where we eliminated the prime from the boosted quantity and we used
the more general notation $p_L$ for the component of the momentum in
the direction of the boost, i.e., in the longitudinal direction. Eliminating
$v_0$ in terms of the rapidity $\zeta$ from eq. (7.21), we have $\gamma(v_0) = \cosh\zeta$, so
$E = mc^2\cosh\zeta$ and $p_L = mc\sinh\zeta$. Then, we have

\[ \frac{(E/c) + p_L}{(E/c) - p_L} = e^{2\zeta} , \tag{7.154} \]

and therefore²⁴

\[ \zeta = \frac{1}{2}\log\left(\frac{E + p_L c}{E - p_L c}\right) . \]
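The additivity of the rapidity under longitudinal boosts can be illustrated with a short numerical check. In this Python sketch the boost is written in terms of the rapidity $\zeta_0$, with the same sign convention as eqs. (7.152) and (7.153), and the rapidity is recovered from the logarithmic expression above. Units with $mc = 1$, the sample rapidities and the function names are assumptions made only for this example.

```python
import math

def boost_longitudinal(E_over_c, p_L, zeta0):
    """Boost (E/c, p_L) with rapidity zeta0 along the longitudinal direction."""
    ch, sh = math.cosh(zeta0), math.sinh(zeta0)
    return E_over_c * ch + p_L * sh, p_L * ch + E_over_c * sh

def rapidity(E_over_c, p_L):
    """zeta = (1/2) log[(E/c + p_L) / (E/c - p_L)]."""
    return 0.5 * math.log((E_over_c + p_L) / (E_over_c - p_L))

m_c = 1.0                 # the product m*c, in arbitrary units
zeta, zeta0 = 0.8, 1.3    # initial rapidity and rapidity of the boost

E_over_c = m_c * math.cosh(zeta)
p_L = m_c * math.sinh(zeta)

E_new, p_new = boost_longitudinal(E_over_c, p_L, zeta0)
zeta_after = rapidity(E_new, p_new)
print(zeta_after, zeta + zeta0)  # the two values agree up to rounding
```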


Covariant formulation of 
electrodynamics 


We now use the formalism developed in the previous chapter to rewrite 
Maxwell’s equations in a form that will make explicit their covariance 
under Lorentz transformation. We find it convenient to start by study- 
ing the source term, i.e., the charge and current density. After having 
established that they can be assembled into a four-vector, we will then 
see how to write Maxwell’s equations in a covariant form. 


8.1 The four-vector current 


We begin by studying the Lorentz transformation properties of the
charge density $\rho$ and of the current density $\mathbf{j}$. For a single point-like
charged particle, the charge density and the current density have been
given in eqs. (3.25) and (3.27). If we denote by $\mathbf{x}(t)$ the trajectory of a
particle with charge $q$, and by $\mathbf{x}$ a generic point in space, we can rewrite
these expressions as

\[ \rho(x) = q\,\delta^{(3)}[\mathbf{x} - \mathbf{x}(t)] , \tag{8.1} \]
\[ \mathbf{j}(x) = q\,\mathbf{v}(t)\,\delta^{(3)}[\mathbf{x} - \mathbf{x}(t)] , \tag{8.2} \]

where $x = (ct,\mathbf{x})$ and $\mathbf{v}(t) = d\mathbf{x}(t)/dt$. A simple way of understanding
their properties under Lorentz transformations is to describe the trajectory
using the four-vector $x^\mu(\tau)$, defined as in Section 7.4.1, instead of
$\mathbf{x}(t)$. Indeed, consider the quantity

\[ j^\mu(x) = q\int c\,d\tau'\,u^\mu(\tau')\,\delta^{(4)}[x - x(\tau')] , \tag{8.3} \]

where the factor of $c$ is inserted for later convenience. Note that $u^\mu(\tau) =
dx^\mu(\tau)/d\tau$ is a four-vector and proper time $\tau$ is a Lorentz scalar. Furthermore,
by definition,

\[ \int d^4x\,\delta^{(4)}(x) = 1 . \tag{8.4} \]

Under a Lorentz transformation $x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$ we have

\[ d^4x \to d^4x' = (\det\Lambda)\,d^4x . \tag{8.5} \]

Since $\det\Lambda = 1$, $d^4x$ is Lorentz invariant. From eq. (8.4), then, also
$\delta^{(4)}(x)$ is Lorentz invariant.¹ It then follows that $j^\mu(x)$, defined by


1 Equivalently, under a generic linear
transformation $x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$ we
have

\[ \delta^{(4)}(x) \to \frac{1}{|\det\Lambda|}\,\delta^{(4)}(x) , \tag{8.6} \]

and, for a Lorentz transformation,
$\det\Lambda = 1$.
2 The solution exists and is unique
because, for any acceptable physical
trajectory (e.g., a trajectory that does
not go back and forth in time) the
parametrization of the trajectory in
terms of proper time is in one-to-one
correspondence with that in terms
of "coordinate time" $t$. Furthermore,
$dx^0/d\tau > 0$, i.e., $t$ and $\tau$ increase in the
same direction.


3 The charge given in eq. (8.11), besides
being conserved, is also a Lorentz
scalar. This is not readily apparent
from eq. (8.11) because the integration
is only over the spatial variables $d^3x$,
and the integrand, $j^0$, is the temporal
component of a four-vector. The explicit
proof is somewhat technical, and
we give it in Solved Problem 8.1, where
we also provide an explicitly Lorentz-invariant
expression for it.


eq. (8.3), is a four-vector, since it is obtained multiplying the four-vector
$u^\mu(\tau')$ by the Lorentz invariant quantity $\delta^{(4)}(x)$ and integrating over the
Lorentz invariant differential $d\tau'$. The integral over $d\tau'$ in eq. (8.3) can
be computed explicitly using eq. (1.61), that we rewrite with the notation

\[ \delta[f(\tau')] = \sum_i \frac{1}{|f'(\tau_i)|}\,\delta(\tau' - \tau_i) , \tag{8.7} \]

where $\tau_i$ are the simple zeros of the function $f(\tau')$. We use this identity
for $f(\tau') = ct - x^0(\tau')$ and we denote by $\tau$ the value of $\tau'$ such that $ct =
x^0(\tau')$,² which is nothing but the proper time of the particle considered.
Then eq. (8.3) can be rewritten as

\[ j^\mu(x) = q\int c\,d\tau'\,u^\mu(\tau')\,\delta^{(3)}[\mathbf{x} - \mathbf{x}(\tau')]\,\delta[ct - x^0(\tau')]
= q\int c\,d\tau'\,u^\mu(\tau')\,\delta^{(3)}[\mathbf{x} - \mathbf{x}(\tau')]\,\frac{1}{|dx^0/d\tau'|}\,\delta(\tau' - \tau)
= qc\,\frac{u^\mu(\tau)}{u^0(\tau)}\,\delta^{(3)}[\mathbf{x} - \mathbf{x}(\tau)] , \tag{8.8} \]

where we used $u^0 = dx^0/d\tau$. Recalling, from eqs. (7.122) and (7.123),
that $u^0 = \gamma c$ and $u^i = \gamma v^i$, and comparing with eqs. (8.1) and (8.2), we
see that

\[ j^\mu = (c\rho, \mathbf{j}) . \tag{8.9} \]

Therefore, the charge density $\rho$ (times $c$) and the current density $\mathbf{j}$, defined
by eqs. (8.1) and (8.2), are the temporal and spatial components,
respectively, of the contravariant four-vector $j^\mu$ defined by eq. (8.3).
Since any distribution of charges and currents can be obtained by superposition
of individual charges, eq. (8.9) is completely general. If we lower
the Lorentz index, according to our metric signature $\eta_{\mu\nu} = (-,+,+,+)$,
we get $j_\mu = (-c\rho, \mathbf{j})$.

In terms of $j^\mu$, the conservation equation (3.22) can be rewritten in
an explicitly Lorentz covariant form as

\[ \partial_\mu j^\mu = 0 . \tag{8.10} \]

Also observe that, in the covariant language, the total charge $Q$ over all
of space is obtained from a volume integral of the $\mu = 0$ component of
the four-vector $j^\mu$,

\[ Q = \frac{1}{c}\int d^3x\, j^0(t,\mathbf{x}) . \tag{8.11} \]

Since the right-hand side is integrated over $\mathbf{x}$, but not over $t$, a priori the
left-hand side could have been a function of time. However, as we saw in
Section 3.2.1, current conservation implies that the charge $Q$ is conserved
in the sense of eq. (3.24), so, in particular, $dQ/dt = 0$ if the integral in
eq. (8.11) is over all space, and we set the boundary condition that $\mathbf{j}$ has
compact support, or anyhow vanishes sufficiently fast at infinity.³
8.2 The four-vector potential $A_\mu$ and the
$F_{\mu\nu}$ tensor


We next consider the scalar potential $\phi(x)$ and the vector potential $\mathbf{A}(x)$.
We combine them into a contravariant four-vector field $A^\mu(x)$,

\[ A^\mu = (\phi/c, \mathbf{A}) , \tag{8.12} \]

and we will show in Section 8.3 that, with this assignment, i.e., with
this definition of how $\phi$ and $\mathbf{A}$ behave under Lorentz transformations,
Maxwell's equations are Lorentz covariant, so that, if they hold in a
frame, they also hold in a Lorentz-transformed frame. For rotations,
this is already explicit from the vector form (3.8)-(3.11). The new result
will be that they are covariant under the larger Lorentz group, i.e., that
Special Relativity is an underlying symmetry of electromagnetism.

First, it is useful to observe that the assignment of $\phi$ and $\mathbf{A}$ to the
four-vector field $A^\mu$ is consistent with gauge invariance, and in fact allows us
to write the gauge transformation (3.86) compactly. This can be shown
observing first that, lowering the Lorentz index,

\[ A_\mu = (-\phi/c, \mathbf{A}) . \tag{8.13} \]

Then eq. (3.86) takes the simple and elegant form

\[ A_\mu \to A'_\mu = A_\mu - \partial_\mu\theta . \tag{8.14} \]

Taking $\theta$ to be a Lorentz scalar function, $\partial_\mu\theta$ is a four-vector field,
and the gauge transformation preserves the fact that $A_\mu$ is a covariant
four-vector field. For the contravariant field $A^\mu$, raising the indices in
eq. (8.14), we have $A^\mu \to A^\mu - \partial^\mu\theta$.

We next introduce the Lorentz tensor field

\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu . \tag{8.15} \]

This tensor is antisymmetric and therefore has six independent components.
We can work them out explicitly as follows. Consider first
$F^{0i}$,

\[ F^{0i} = \partial^0 A^i - \partial^i A^0 = \frac{1}{c}\left(-\partial_t A^i - \partial^i\phi\right) . \tag{8.16} \]

Comparing with eq. (3.83) (and recalling that $\partial^i = \partial_i$ is the $i$-th component
of $\nabla$) we discover that $F^{0i}$ is just the $i$-th component of the electric
field (divided by $c$),⁴

\[ F^{0i} = \frac{E^i}{c} , \tag{8.17} \]


4 The extra factors of $c$ or $1/c$ in several
of the subsequent formulas, as well as
the relative factor of $c$ between the temporal
and spatial components of $A^\mu$ in
eq. (8.12), are absent in Gaussian units,
see App. A, and this is among the reasons
why Gaussian units provide nicer
expressions for relativistic equations.
5 Note that the signs in the matrix elements
depend on our choice of metric
$\eta_{\mu\nu} = (-,+,+,+)$. To compare with
a text that uses $\eta_{\mu\nu} = (+,-,-,-)$
[such as Jackson (1998)], one must observe
that $A^\mu$ is always defined in terms
of $\phi$ and $\mathbf{A}$ from eq. (8.12), but now
$A_\mu = \eta_{\mu\nu}A^\nu$ differs from ours by an
overall minus sign (while $\partial_\mu = \partial/\partial x^\mu$ is
unchanged because it involves $x^\mu$ with
an upper index); therefore also $F_{\mu\nu} =
\partial_\mu A_\nu - \partial_\nu A_\mu$ has the opposite sign,
and the same holds for $F^{\mu\nu}$, since we
need two $\eta$ factors for raising its indices,
$F^{\mu\nu} = \eta^{\mu\rho}\eta^{\nu\sigma}F_{\rho\sigma}$.


6 As remarked in Note 5, if one rather
uses the signature $\eta_{\mu\nu} = (+,-,-,-)$
for the metric, opposite to ours, $F^{\mu\nu}$
also differs from our definition by an
overall sign, and then eq. (8.23) is replaced
by $\partial_\mu F^{\mu\nu} = +\mu_0 j^\nu$.


or $E_i/c = -F_{0i} = F_{i0}$. Similarly (keeping for simplicity all indices lower,
since spatial indices are raised and lowered with $\delta_{ij}$),

\[ \epsilon_{ijk}F_{jk} = \epsilon_{ijk}(\partial_j A_k - \partial_k A_j) = 2\epsilon_{ijk}\partial_j A_k , \tag{8.18} \]

where in the last equality we used the antisymmetry of $\epsilon_{ijk}$. However,
$\epsilon_{ijk}\partial_j A_k$ is just $(\nabla\times\mathbf{A})_i$. Then, comparison with eq. (3.80) shows that

\[ B_i = \frac{1}{2}\epsilon_{ijk}F_{jk} , \tag{8.19} \]

which can be inverted to give

\[ F_{ij} = \epsilon_{ijk}B_k . \tag{8.20} \]

Thus, the six independent components of $F^{\mu\nu}$ are given by the three components
of $\mathbf{E}$ and the three components of $\mathbf{B}$. Explicitly, as a matrix,⁵

\[ F^{\mu\nu} = \begin{pmatrix} 0 & E_1/c & E_2/c & E_3/c \\ -E_1/c & 0 & B_3 & -B_2 \\ -E_2/c & -B_3 & 0 & B_1 \\ -E_3/c & B_2 & -B_1 & 0 \end{pmatrix} . \tag{8.21} \]

Under a gauge transformation (8.14),

\[ F_{\mu\nu} \to F_{\mu\nu} - (\partial_\mu\partial_\nu\theta - \partial_\nu\partial_\mu\theta) = F_{\mu\nu} , \tag{8.22} \]

(we always consider transformations such that the function $\theta$ is infinitely
differentiable, and therefore on it the derivatives commute), so $F_{\mu\nu}$ is
gauge invariant. Indeed, we have already seen in Section 3.3 that the
electric field and the magnetic field are gauge invariant.
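The component relations (8.17), (8.19) and (8.20) and the matrix (8.21) can be cross-checked with a few lines of code. The Python sketch below assembles $F^{\mu\nu}$ from given $\mathbf{E}$ and $\mathbf{B}$, verifies its antisymmetry, and recovers $\mathbf{B}$ through eq. (8.19). The field values are arbitrary sample numbers and the helper names are hypothetical, introduced only for this illustration.

```python
C = 299_792_458.0  # speed of light in m/s

def eps3(i, j, k):
    """Three-dimensional Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def field_tensor(E, B, c=C):
    """Contravariant F^{mu nu} with F^{0i} = E_i/c and F^{ij} = eps_ijk B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i] / c
        F[i + 1][0] = -E[i] / c
        for j in range(3):
            F[i + 1][j + 1] = sum(eps3(i, j, k) * B[k] for k in range(3))
    return F

E = [1.0e3, -2.0e3, 0.5e3]       # sample electric field in V/m
B = [1.0e-5, 2.0e-5, -3.0e-5]    # sample magnetic field in T
F = field_tensor(E, B)

# Antisymmetry F^{mu nu} = -F^{nu mu}
antisymmetric = all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))

# Invert with eq. (8.19): B_i = (1/2) eps_ijk F_jk
B_back = [0.5 * sum(eps3(i, j, k) * F[j + 1][k + 1]
                    for j in range(3) for k in range(3))
          for i in range(3)]
print(antisymmetric, B_back)
```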


8.3 Covariant form of Maxwell’s equations 


We are now ready to write Maxwell's equations in covariant form. Consider
the equation

\[ \partial_\mu F^{\mu\nu} = -\mu_0 j^\nu , \tag{8.23} \]

where $j^\nu$ is given by eq. (8.9). Note that eq. (8.23) is a four-vector
equation, since the index $\nu$ is free. For $\nu = 0$, $\partial_\mu F^{\mu 0}$ is the same as
$\partial_i F^{i0}$, since $F^{00} = 0$. Then, using $F^{i0} = -E^i/c$ and $j^0 = c\rho$, we see
that eq. (8.23) with $\nu = 0$ is the same as

\[ \nabla\cdot\mathbf{E} = \mu_0 c^2\rho . \tag{8.24} \]
For $\nu = i$, instead,

\[ \partial_\mu F^{\mu i} = \partial_0 F^{0i} + \partial_j F^{ji} = \frac{1}{c^2}\partial_t E^i - \partial_j\epsilon^{ijk}B^k = \frac{1}{c^2}\partial_t E^i - (\nabla\times\mathbf{B})^i . \tag{8.25} \]

Therefore, eq. (8.23) with $\nu = i$ reduces to the vector equation

\[ \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{j} . \tag{8.26} \]

Note that eqs. (8.24) and (8.26) are written in terms of only two parameters:
$\mu_0$, that appears in eq. (8.23), and $c$, which physically represents
the speed of light and, formally, entered these equations through the
definition of the Lorentz group [defined by the condition that Lorentz
transformations leave invariant the quadratic form (7.13)] and, from
there, entered all other formulas of this chapter through $x^\mu = (ct,\mathbf{x})$,
$\partial_0 = (1/c)\partial_t$, and so on.

We see that eqs. (8.24) and (8.26) are the same as eqs. (3.1) and (3.2)
once we make the identification

\[ \epsilon_0\mu_0 = \frac{1}{c^2} . \tag{8.27} \]

The quantity that we formally denoted by $c$ in eq. (3.7) is therefore the
same as the speed of light that we are using here, as the notation in
eq. (3.7) already anticipated.⁷

In terms of $A^\mu$, eq. (8.23) reads

\[ \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = -\mu_0 j^\nu , \tag{8.28} \]

which puts together eqs. (3.84) and (3.85) into a single four-vector
equation. Also observe that the Lorenz gauge condition (3.89) takes
the compact form

\[ \partial_\mu A^\mu = 0 . \tag{8.29} \]

Therefore, in the Lorenz gauge, eq. (8.28) becomes a simple wave equation,

\[ \Box A^\mu = -\mu_0 j^\mu , \tag{8.30} \]

which, again, unifies the scalar and vector equations (3.90) and (3.91)
into a single four-vector equation.

Let us now turn to the second pair of Maxwell’s equations, eqs. (3.10) 
and (3.11). These equations do not involve the sources and we have 


"We will confirm the interpretation of 
$1/\sqrt{\epsilon_0\mu_0}$ as the speed of light in Chapter 9,
where we will see that it is indeed
the velocity at which electromagnetic
waves travel in vacuum.


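8 The equation with $\nu = i$ reduces
indeed trivially to eq. (3.85). Equation (8.28)
with $\nu = 0$ can be written
as

\[ \Box(\phi/c) - \partial^0[\partial_0(\phi/c) + \nabla\cdot\mathbf{A}] = -\mu_0 c\rho . \]

Using $\Box - \partial^0\partial_0 = \nabla^2$, $\partial^0 = -\partial_0 =
-(1/c)\partial_t$ and $\mu_0c^2 = 1/\epsilon_0$ we get back
eq. (3.84).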
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seen that, once the electric and magnetic fields are written in terms of 
the gauge potentials, they reduce to mathematical identities. In the 
covariant formalism this comes out as follows. We define the tensor 


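$$ \tilde{F}^{\mu\nu} = \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma} . \tag{8.31} $$

This is called the tensor dual to $F_{\mu\nu}$. Note that, since $\epsilon^{\mu\nu\rho\sigma}$ is totally antisymmetric, we can also rewrite it as

$$ \tilde{F}^{\mu\nu} = \frac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} (\partial_\rho A_\sigma - \partial_\sigma A_\rho) = \epsilon^{\mu\nu\rho\sigma}\partial_\rho A_\sigma . \tag{8.32} $$

Again because of the antisymmetry of $\epsilon^{\mu\nu\rho\sigma}$, we have the mathematical identity $\epsilon^{\mu\nu\rho\sigma}\partial_\mu\partial_\rho = 0$ (as an operator acting on any infinitely differentiable function, on which $\partial_\mu$ and $\partial_\rho$ commute), since $\partial_\mu\partial_\rho$ is symmetric in $(\mu, \rho)$ and its contraction with the antisymmetric tensor $\epsilon^{\mu\nu\rho\sigma}$ vanishes. Therefore,

$$ \partial_\mu \tilde{F}^{\mu\nu} = 0 . \tag{8.33} $$

Proceeding as before (and observing that $\epsilon^{0\nu\rho\sigma}$ is different from zero only when $\nu, \rho, \sigma$ are all spatial indices, and $\epsilon^{0ijk}$ is equal to the three-dimensional tensor $\epsilon^{ijk}$) one can check that, for $\nu = 0$, eq. (8.33) gives eq. (3.10), while for $\nu = i$ it gives eq. (3.11).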

Equations (8.23) and (8.33) are therefore Maxwell’s equations in co- 
variant form. From these expressions, it is explicit that the underlying 
symmetry of Maxwell’s equations is given by the Lorentz group, i.e., spa- 
tial rotations and Lorentz boosts, since both sides of eq. (8.23) transform 
as a four-vector, while eq. (8.33) sets a four-vector to zero, which is of 
course also a condition that, if it holds in a reference frame, holds in 
any other frame related by a Lorentz transformation. The covariant for- 
malism therefore unveils a symmetry that, in the original formulation 
(3.8-3.11), was actually already present, but was not readily visible. 


8.4 Energy-momentum tensor of the 
electromagnetic field 


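We can now use the covariant formalism to discuss the energy and momentum of the electromagnetic field. Let us consider the tensor

$$ T^{\mu\nu} = -\frac{1}{\mu_0}\left( F^{\mu\rho}F_{\rho}{}^{\nu} + \frac{1}{4}\,\eta^{\mu\nu} F^{\rho\sigma}F_{\rho\sigma} \right) . \tag{8.34} $$

This is called the energy-momentum tensor of the electromagnetic field, for reasons that we will now explain. The (00) component can be written as

$$ T^{00} = -\frac{1}{\mu_0}\left[ F^{0i}F_{i}{}^{0} - \frac{1}{4}\left( 2F^{0i}F_{0i} + F^{ij}F_{ij} \right) \right] = \frac{1}{\mu_0}\left( \frac{1}{2}F^{0i}F^{0i} + \frac{1}{4}F^{ij}F^{ij} \right) , \tag{8.35} $$

since $F_{i}{}^{0} = -F_{i0} = F_{0i}$. Using eqs. (8.17) and (8.20), as well as eq. (8.27), we get

$$ T^{00} = \frac{1}{2}\left( \epsilon_0 E^2 + \mu_0^{-1} B^2 \right) . \tag{8.36} $$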
Comparing with eq. (3.43), we see that $T^{00}$ is the energy density $u$ of the electromagnetic field. Similarly, setting $\mu = 0$ and $\nu = i$ in eq. (8.34), we find that

$$ T^{0i} = \frac{1}{c}\, S^i , \tag{8.37} $$

where S is the Poynting vector (3.34). Given that $T^{\mu\nu}$ is symmetric, we also have $T^{i0} = T^{0i}$. Finally, setting $\mu = i$ and $\nu = j$ in eq. (8.34), and using again eq. (8.27), we get

$$ T^{ij} = \epsilon_0\left[ \frac{1}{2}\,\delta^{ij}\left( E^2 + c^2 B^2 \right) - E^i E^j - c^2 B^i B^j \right] , \tag{8.38} $$

which is the same as eq. (3.64). We therefore see that the energy density $u$ of the electromagnetic field given in eq. (3.43), $1/c$ times the Poynting vector $S^i$ given in eq. (3.34), and the Maxwell stress tensor given in eq. (3.64), form together a Lorentz tensor $T^{\mu\nu}$.⁹
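The identification of the components of $T^{\mu\nu}$ can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the conventions used in this chapter ($\eta = {\rm diag}(-,+,+,+)$, $F^{0i} = E^i/c$, $F^{ij} = \epsilon^{ijk}B_k$), and the function names `field_tensor` and `energy_momentum_tensor`, as well as the sample field values, are purely illustrative.

```python
import math

C = 299_792_458.0          # speed of light in m/s
MU0 = 4.0e-7 * math.pi     # vacuum permeability (conventional value, adequate for a check)
EPS0 = 1.0 / (MU0 * C**2)  # fixed by eq. (8.27)
ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of eta = diag(-,+,+,+)


def field_tensor(E, B):
    """Contravariant F^{mu nu}, with F^{0i} = E^i/c and F^{ij} = eps^{ijk} B_k."""
    ex, ey, ez = (comp / C for comp in E)
    bx, by, bz = B
    return [
        [0.0, ex, ey, ez],
        [-ex, 0.0, bz, -by],
        [-ey, -bz, 0.0, bx],
        [-ez, by, -bx, 0.0],
    ]


def energy_momentum_tensor(E, B):
    """T^{mu nu} = -(1/mu0) (F^{mu rho} F_rho^nu + (1/4) eta^{mu nu} F^{rho sigma} F_{rho sigma})."""
    F = field_tensor(E, B)
    # With a diagonal metric, lowering an index multiplies by the matching ETA entry.
    f_squared = sum(F[r][s] ** 2 * ETA[r] * ETA[s] for r in range(4) for s in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            contraction = sum(F[m][r] * ETA[r] * F[r][n] for r in range(4))
            eta_mn = ETA[m] if m == n else 0.0
            T[m][n] = -(contraction + 0.25 * eta_mn * f_squared) / MU0
    return T


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


# Illustrative field values (V/m and T).
E = (1.0e3, -2.0e3, 5.0e2)
B = (1.0e-6, 2.0e-6, -3.0e-6)
T = energy_momentum_tensor(E, B)
u = 0.5 * (EPS0 * dot(E, E) + dot(B, B) / MU0)   # energy density
S = tuple(s / MU0 for s in cross(E, B))          # Poynting vector
print("T^00 - u        :", T[0][0] - u)
print("T^0i - S^i/c    :", [T[0][i + 1] - S[i] / C for i in range(3)])
print("trace eta_mn T^mn:", sum(ETA[m] * T[m][m] for m in range(4)))
```

All three printed quantities should vanish up to floating-point rounding, in agreement with eqs. (8.36) and (8.37) and with the tracelessness of $T^{\mu\nu}$.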

The conservation equations discussed separately in Section 3.2.2 can then be derived, in a unified manner, in terms of the Lorentz tensor $T^{\mu\nu}$. Applying $\partial_\nu$ to both sides of eq. (8.34), and using the equation of motion (8.23), we find¹⁰

$$ \partial_\nu T^{\mu\nu} = -F^{\mu\rho} j_\rho . \tag{8.39} $$

In particular, setting $\mu = 0$, we have

$$ \partial_0 T^{00} + \partial_i T^{0i} = -F^{0i} j_i . \tag{8.40} $$

Using $\partial_0 = (1/c)\partial_t$, $T^{00} = u$, $T^{0i} = T^{i0} = S^i/c$ and $F^{0i} = E^i/c$, we can rewrite this as

$$ \partial_t u + \nabla\cdot\mathbf{S} = -\mathbf{E}\cdot\mathbf{j} , \tag{8.41} $$

which is the same as eq. (3.47). Energy conservation is therefore the $\mu = 0$ component of eq. (8.39).

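Similarly, recalling that $j_0 = -c\rho$, the $\mu = i$ component of eq. (8.39) gives

$$ \partial_0 T^{i0} + \partial_j T^{ij} = -F^{i0} j_0 - F^{ik} j_k = -\rho E^i - \epsilon^{ikl} j_k B_l = -(\rho\mathbf{E} + \mathbf{j}\times\mathbf{B})^i . \tag{8.42} $$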


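Writing again $\partial_0 = (1/c)\partial_t$, and observing, from eqs. (3.57) and (8.37), that

$$ T^{0i}(x) = c\, g^i(x) , \tag{8.43} $$

we get

$$ \partial_t g^i + \partial_j T^{ij} = -(\rho\mathbf{E} + \mathbf{j}\times\mathbf{B})^i , \tag{8.44} $$

which is the same as eq. (3.62). This confirms the result found in eq. (3.67) with a non-covariant formalism. Note that the scalar and vector conservation equations (3.47) and (3.62) are now unified into the single covariant equation (8.39).

9 It is also interesting to observe that $T^{\mu\nu}$ is traceless. In fact, using $\eta_{\mu\nu}\eta^{\mu\nu} = 4$ in eq. (8.34),

$$ \eta_{\mu\nu}T^{\mu\nu} = -\frac{1}{\mu_0}\left( F^{\mu\rho}F_{\rho\mu} + F^{\rho\sigma}F_{\rho\sigma} \right) = \frac{1}{\mu_0}\left( F^{\mu\rho}F_{\mu\rho} - F^{\rho\sigma}F_{\rho\sigma} \right) = 0 . $$

In the quantum theory, this is related to the fact that the photon is a massless particle.

10 The explicit computation is as follows. Applying $\partial_\mu$ to both sides of eq. (8.34), we get

$$ \begin{aligned} -\mu_0\,\partial_\mu T^{\mu\nu} = {} & (\partial_\mu F^{\mu\rho}) F_{\rho}{}^{\nu} \\ & + F^{\mu\rho}\,\partial_\mu F_{\rho}{}^{\nu} + \frac{1}{2}\,(\partial^\nu F^{\rho\sigma}) F_{\rho\sigma} . \end{aligned} $$

Using the equation of motion (8.23), the term on the right-hand side, in the first line, can be rewritten as

$$ (\partial_\mu F^{\mu\rho}) F_{\rho}{}^{\nu} = -\mu_0\, j^\rho F_{\rho}{}^{\nu} . $$

The sum of the terms in the second line, instead, vanishes. In fact, we can rewrite it as

$$ \begin{aligned} F_{\mu\rho}\,\partial^\mu F^{\rho\nu} + (\partial^\nu\partial^\rho A^\sigma) F_{\rho\sigma} &= F_{\sigma\rho}\,\partial^\sigma F^{\rho\nu} + (\partial^\nu\partial^\rho A^\sigma) F_{\rho\sigma} \\ &= F_{\rho\sigma}\left[ \partial^\nu\partial^\rho A^\sigma - \partial^\sigma(\partial^\rho A^\nu - \partial^\nu A^\rho) \right] \\ &= F_{\rho\sigma}\left[ \partial^\nu(\partial^\rho A^\sigma + \partial^\sigma A^\rho) - \partial^\rho\partial^\sigma A^\nu \right] . \end{aligned} $$

We now observe that $F_{\rho\sigma}$ is antisymmetric in $(\rho, \sigma)$, while the expression in brackets is symmetric. Therefore, the contraction vanishes. We are then left with

$$ \partial_\mu T^{\mu\nu} = j_\rho F^{\rho\nu} . $$

Renaming the indices $\mu \leftrightarrow \nu$ and using $T^{\mu\nu} = T^{\nu\mu}$ and $F^{\mu\nu} = -F^{\nu\mu}$, we finally obtain eq. (8.39).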

Since $T^{00}$ is the energy density of the electromagnetic field, and $T^{0i}/c$ is the momentum density, the energy and momentum carried by the electromagnetic field are given by

$$ \mathcal{E}_{\rm em} = \int d^3x\, T^{00} , \tag{8.45} $$

$$ P^i_{\rm em} = \frac{1}{c}\int d^3x\, T^{0i} , \tag{8.46} $$

where the integral extends over all of space (or over a finite volume V, if we are interested in the energy and momentum of the electromagnetic field in that volume). As can be expected by analogy with the result (7.144) valid for point particles, the quantity

$$ P^\mu_{\rm em} = \left( \mathcal{E}_{\rm em}/c ,\, \mathbf{P}_{\rm em} \right) \tag{8.47} $$

$$ \phantom{P^\mu_{\rm em}} = \frac{1}{c}\int d^3x\, T^{0\mu} \tag{8.48} $$

is a four-vector. The proof is analogous to the one used to show that the charge Q given in eq. (8.11) is a Lorentz scalar, and we give it explicitly in Problem 8.2.


8.5 Lorentz transformations of electric 
and magnetic fields 


The results of Sections 8.2 and 8.3 show that, from the point of view of Lorentz transformations, the vectors E and B are not the spatial components of a four-vector, as one could have naively guessed. There is no “$E^0$” that combines with E to form a four-vector, and similarly for B. Rather, Maxwell’s equations become Lorentz covariant when we assemble E and B together into a different representation of the Lorentz group, the antisymmetric tensor $F^{\mu\nu}$.

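From this, we can immediately derive how E and B transform under Lorentz transformations. Under a Lorentz transformation

$$ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu\, x^\nu , \tag{8.49} $$

we have

$$ F^{\mu\nu}(x) \to F'^{\mu\nu}(x') = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, F^{\rho\sigma}(x) , \tag{8.50} $$

or, in matrix form,

$$ F(x) \to F'(x') = \Lambda F(x) \Lambda^{T} . \tag{8.51} $$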


Let K be an inertial frame where the electric and magnetic fields are E(x) and B(x), and let K′ be the reference frame of an observer that moves with constant velocity $\mathbf{v} = v\hat{\mathbf{x}}$ with respect to K. The coordinates of the two frames are then related by a boost along the x axis, as in eqs. (7.22) and (7.23), with $v_0 = -v$ (since, if $v > 0$, K′ moves in the direction of the positive x axis with respect to K, so a particle at rest in K moves toward the negative x axis in K′). Then, $\Lambda^\mu{}_\nu$ is given by the matrix in eq. (7.20), with $\tanh\zeta = -v/c$. Performing the matrix product explicitly in eq. (8.51) we get, for the electric and magnetic field seen by the K′ observer,

$$ E'_1 = E_1 , \qquad E'_2 = \gamma\,(E_2 - vB_3) , \qquad E'_3 = \gamma\,(E_3 + vB_2) , \tag{8.52} $$

and

$$ B'_1 = B_1 , \qquad B'_2 = \gamma\left( B_2 + \frac{v}{c^2}E_3 \right) , \qquad B'_3 = \gamma\left( B_3 - \frac{v}{c^2}E_2 \right) , \tag{8.53} $$

where, for notational simplicity, we have not explicitly written the argument $x$ in E(x), B(x) and the argument $x'$ in E′(x′), B′(x′). These expressions can be rewritten in a rotationally invariant form (which therefore holds for boosts in a generic direction) denoting by $\mathbf{E}_\parallel$ the component of E parallel to the boost axis (so, for a boost along the x axis, $\mathbf{E}_\parallel = E_x\hat{\mathbf{x}}$) and by $\mathbf{E}_\perp$ the component transverse to the boost axis (so, for a boost along the x axis, $\mathbf{E}_\perp = E_y\hat{\mathbf{y}} + E_z\hat{\mathbf{z}}$), and similarly for B. Then eqs. (8.52) and (8.53) can be written as¹¹

$$ \mathbf{E}'_\parallel = \mathbf{E}_\parallel , \qquad \mathbf{E}'_\perp = \gamma\,(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B}_\perp) , \tag{8.57} $$

and

$$ \mathbf{B}'_\parallel = \mathbf{B}_\parallel , \qquad \mathbf{B}'_\perp = \gamma\,(\mathbf{B}_\perp - \mathbf{v}\times\mathbf{E}_\perp/c^2) . \tag{8.58} $$


Consider, for instance, a frame where we have an electric field along the z axis, $\mathbf{E} = E\hat{\mathbf{z}}$, while $\mathbf{B} = 0$; an observer that moves with velocity $\mathbf{v} = v\hat{\mathbf{x}}$ with respect to this frame will see an electric field $\mathbf{E}' = \gamma E\hat{\mathbf{z}}$ and a non-vanishing magnetic field $\mathbf{B}' = \gamma (v/c^2) E\hat{\mathbf{y}}$. The latter result is at first very surprising. Boosting an electric field we generate a magnetic field! This is at the core of the fact that Maxwell’s equations unify electric and magnetic phenomena into electromagnetism, and E and B are deeply interrelated, to the extent that, with a boost, we can generate a magnetic field from an electric field (and, conversely, we can generate an electric field from a magnetic field). At the mathematical level, this is expressed by the fact that E and B do not belong to two separate representations of the Lorentz group but, rather, together fill the components of an antisymmetric tensor $F_{\mu\nu}$, and therefore mix under Lorentz boosts.

Physically, this can also be understood from the fact that the charge 
density and the current density form a four-vector, eq. (8.9). Suppose 
that in the frame K we have just an electric charge at rest at the ori- 
gin. This will generate a radial electric field, but no magnetic field. In 
the boosted frame K’, however, the charge is in motion with velocity 
v. It therefore produces a current j, that generates a magnetic field 
perpendicular to its direction of motion. 
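The matrix product in eq. (8.51) is easy to carry out numerically. The sketch below is an illustration only: it assumes the conventions $F^{0i} = E^i/c$, $F^{ij} = \epsilon^{ijk}B_k$ and a boost matrix for a frame K′ moving with velocity $v$ along the x axis; the helper names `field_tensor`, `boost_x` and `boost_fields` are hypothetical. It reproduces the example discussed above, a purely electric field along z acquiring a magnetic component along y.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def field_tensor(E, B):
    """Contravariant F^{mu nu}, with F^{0i} = E^i/c and F^{ij} = eps^{ijk} B_k."""
    ex, ey, ez = (comp / C for comp in E)
    bx, by, bz = B
    return [
        [0.0, ex, ey, ez],
        [-ex, 0.0, bz, -by],
        [-ey, -bz, 0.0, bx],
        [-ez, by, -bx, 0.0],
    ]


def boost_x(v):
    """Lambda^mu_nu for a frame K' moving with velocity v along x relative to K."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def boost_fields(E, B, v):
    """Return (E', B') from F' = Lambda F Lambda^T, eq. (8.51)."""
    lam = boost_x(v)
    F = field_tensor(E, B)
    Fp = [
        [sum(lam[m][r] * F[r][s] * lam[n][s] for r in range(4) for s in range(4))
         for n in range(4)]
        for m in range(4)
    ]
    E_prime = (C * Fp[0][1], C * Fp[0][2], C * Fp[0][3])
    B_prime = (Fp[2][3], Fp[3][1], Fp[1][2])
    return E_prime, B_prime


# Example from the text: E along z, B = 0, observer moving along x.
v = 0.6 * C
E_prime, B_prime = boost_fields((0.0, 0.0, 1.0e3), (0.0, 0.0, 0.0), v)
print("E' =", E_prime)
print("B' =", B_prime)
```

For generic input fields, the same function can be compared component by component with eqs. (8.52) and (8.53).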


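11 Equivalently, we can write $\mathbf{v}\times\mathbf{E}_\perp$ as $\mathbf{v}\times\mathbf{E}$ and $\mathbf{v}\times\mathbf{B}_\perp$ as $\mathbf{v}\times\mathbf{B}$, given that $\mathbf{v}\times\mathbf{E}_\parallel = \mathbf{v}\times\mathbf{B}_\parallel = 0$. Another expression equivalent to eqs. (8.57) and (8.58) is

$$ \mathbf{E}' = \gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) - (\gamma - 1)\,\hat{\mathbf{v}}\,(\hat{\mathbf{v}}\cdot\mathbf{E}) , \tag{8.54} $$

and

$$ \mathbf{B}' = \gamma\,(\mathbf{B} - \mathbf{v}\times\mathbf{E}/c^2) - (\gamma - 1)\,\hat{\mathbf{v}}\,(\hat{\mathbf{v}}\cdot\mathbf{B}) , \tag{8.55} $$

where $\hat{\mathbf{v}} = \mathbf{v}/v$. The equivalence of eqs. (8.57) and (8.54) can be shown writing $\mathbf{E} = \mathbf{E}_\perp + \mathbf{E}_\parallel$, $\mathbf{E}' = \mathbf{E}'_\perp + \mathbf{E}'_\parallel$ and observing that $\mathbf{E}_\parallel$ is the projection of E in the direction of the velocity, so $\mathbf{E}_\parallel = (\mathbf{E}\cdot\hat{\mathbf{v}})\hat{\mathbf{v}}$, while $\mathbf{v}\times\mathbf{B}$ is transverse to v so, in eq. (8.54), it only contributes to $\mathbf{E}'_\perp$. Then eq. (8.54) gives

$$ \mathbf{E}'_\perp = \gamma\,(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B}) , \tag{8.56} $$

and $\mathbf{v}\times\mathbf{B} = \mathbf{v}\times\mathbf{B}_\perp$, so we recover the second equation in eq. (8.57). For the parallel component, writing $\mathbf{E}_\parallel = E_\parallel\hat{\mathbf{v}}$, $\mathbf{E}'_\parallel = E'_\parallel\hat{\mathbf{v}}$, and using $\hat{\mathbf{v}}\cdot\mathbf{E} = E_\parallel$, eq. (8.54) gives

$$ E'_\parallel = \gamma E_\parallel - (\gamma - 1) E_\parallel , $$

so $E'_\parallel = E_\parallel$. Equation (8.54) is therefore equivalent to eq. (8.57). One proceeds in the same way to show the equivalence of eqs. (8.55) and (8.58). Observe that the second term on the right-hand side of eq. (8.54) can be rewritten using the identity

$$ (\gamma - 1)\,\hat{\mathbf{v}}\,(\hat{\mathbf{v}}\cdot\mathbf{E}) = \frac{\gamma^2}{\gamma + 1}\,\boldsymbol{\beta}\,(\boldsymbol{\beta}\cdot\mathbf{E}) , $$

where $\boldsymbol{\beta} = \mathbf{v}/c$. This identity can be shown writing $\gamma^2 = 1/(1 - \beta^2)$, which gives $\beta^2 = (\gamma^2 - 1)/\gamma^2$. One can similarly rewrite the second term on the right-hand side of eq. (8.55).

Exercise 8.1 Using eqs. (8.52) and (8.53), show that $E^2 - c^2B^2$ and $\mathbf{E}\cdot\mathbf{B}$ are Lorentz invariant. Try to find an explicit Lorentz-invariant expression for $E^2 - c^2B^2$.


Exercise 8.2 Compute the Lorentz-invariant quantities $F_{\mu\nu}F^{\mu\nu}$ and $\tilde{F}_{\mu\nu}F^{\mu\nu}$ in terms of E and B.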


Exercise 8.3 Without using the covariant formalism, and rather work- 
ing with E and B, show that, under the transformation (8.52, 8.53), 
the full set of Maxwell’s equations (3.8-3.11), with the source terms 
set to zero, is invariant (in the sense that, if the original fields satisfied 
Maxwell’s equations, the transformed fields also satisfy them). Is each 
Maxwell equation separately invariant? Repeat the exercise including 
the source terms, taking into account that, according to eq. (8.9), $c\rho$ and
j transform as a four-vector.


Exercise 8.4 As an example of the usefulness of a covariant formalism,
consider a set of equations, involving two two-dimensional vectors $\mathbf{a} = (a_x, a_y)$ and $\mathbf{b} = (b_x, b_y)$, and two more quantities A, B, given by


Ba = Ab, Eaplabg = 0, (8.59) 


where the indices $\alpha, \beta$ take the values 1,2 (or, equivalently, x,y) and
$\epsilon_{\alpha\beta}$ is the antisymmetric tensor in two dimensions, $\epsilon_{12} = -\epsilon_{21} = 1$,
$\epsilon_{11} = \epsilon_{22} = 0$. Show that this system of equations is invariant under
rotations in the (x,y) plane, under which a and b transform as two- 
dimensional vectors, and A and B as scalars. Can you find a formalism 
that explicitly shows that this system of equations actually has a much 
larger symmetry, which was not apparent from eq. (8.59)? Can you 
draw the analogy with our discovery of Lorentz invariance in Maxwell’s 
equations? 


8.6 Relativistic formulation of the 
particle-field interaction 


8.6.1 Covariant form of the Lorentz force equation 


The final step, to obtain a fully covariant formulation of electrodynamics, is to write also the interaction of a particle with the electromagnetic field in a relativistic form. We therefore investigate whether there is a covariant expression, whose spatial components give the vector equation (3.6). To this purpose, it is useful to rewrite eq. (3.6) in terms of proper time, using $d\tau = dt/\gamma$. We also recall, from eqs. (7.122) and (7.123), that $u^0 = \gamma c$, $u^i = \gamma v^i$. Equation (3.6) can then be rewritten as

$$ \frac{d\mathbf{p}}{d\tau} = q\gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) = q\left( \frac{u^0}{c}\,\mathbf{E} + \mathbf{u}\times\mathbf{B} \right) , \tag{8.60} $$

or, in components

$$ \frac{dp^i}{d\tau} = q\left( -u_0\,\frac{E^i}{c} + \epsilon^{ijk} u_j B_k \right) = q\left( F^{i0}u_0 + F^{ij}u_j \right) , \tag{8.61} $$

where we used eqs. (8.17) and (8.20), together with $F^{0i} = -F^{i0}$ and $F^{ij} = F_{ij}$. Recalling, from eq. (7.144), that $p^\mu = (\mathcal{E}/c, p^i)$ (where here we denote the energy of the particle by $\mathcal{E}$, since we reserve the notation $E$ to the modulus of the electric field E), we recognize this as the $\mu = i$ component of the equation

$$ \frac{dp^\mu}{d\tau} = q\, F^{\mu\nu} u_\nu , \tag{8.62} $$

or, equivalently, using eq. (7.143),

$$ \frac{dp^\mu}{d\tau} = \frac{q}{m}\, F^{\mu\nu} p_\nu . \tag{8.63} $$

Therefore, eq. (3.6), with p interpreted as the full relativistic momentum,

$$ \mathbf{p} = \gamma(v)\, m\mathbf{v} , \tag{8.64} $$

is the spatial component of a covariant equation, and therefore is also valid relativistically. Explicitly, in terms of the velocity, the spatial component of the relativistic Lorentz “force” equations (8.62) is¹²

$$ \frac{d}{dt}\left[ \gamma(v)\, m\mathbf{v} \right] = q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) . \tag{8.65} $$

This covariantization also carries with it the $\mu = 0$ component, that must correspond to another equation that could have been derived with the non-covariant formalism. Indeed, using again $d\tau = dt/\gamma$ and $u^i = \gamma v^i$, the $\mu = 0$ component of eq. (8.62) reads

$$ \frac{d\mathcal{E}}{dt} = q\,\mathbf{E}\cdot\mathbf{v} . \tag{8.66} $$

In fact, this is just eq. (7.151), that in our present notation reads $d\mathcal{E}/dt = \mathbf{v}\cdot d\mathbf{p}/dt$, with $d\mathbf{p}/dt$ expressed through eq. (3.6). Note that only the electric field contributes to the work. This is a consequence of the fact that the contribution to $d\mathbf{p}/dt$ from the magnetic field, $q\mathbf{v}\times\mathbf{B}$, is orthogonal to v, and therefore does not contribute to $\mathbf{v}\cdot d\mathbf{p}/dt$.

This derivation shows that the Lorentz force equation, when written 
in the form (3.6) with p given in terms of the velocity as in eq. (8.64), 
is not just the low-velocity limit of some fully relativistic expression, 
but is in fact already the spatial component of a four-vector equation, 
and is therefore correct even at relativistic velocities, as the covariant 
formalism makes explicit. 
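As a quick numerical illustration of this statement, one can evaluate the right-hand side of eq. (8.62) for given fields and velocity, and compare its components with eqs. (8.60) and (8.66). The following is only a sketch under the conventions of this chapter ($\eta = {\rm diag}(-,+,+,+)$, $F^{0i} = E^i/c$, $F^{ij} = \epsilon^{ijk}B_k$); the function names and the sample values are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def field_tensor(E, B):
    """Contravariant F^{mu nu}, with F^{0i} = E^i/c and F^{ij} = eps^{ijk} B_k."""
    ex, ey, ez = (comp / C for comp in E)
    bx, by, bz = B
    return [
        [0.0, ex, ey, ez],
        [-ex, 0.0, bz, -by],
        [-ey, -bz, 0.0, bx],
        [-ez, by, -bx, 0.0],
    ]


def covariant_force(q, v, E, B):
    """Right-hand side of eq. (8.62): q F^{mu nu} u_nu, returned as a list of 4 components."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
    u_upper = (gamma * C, gamma * v[0], gamma * v[1], gamma * v[2])
    # Lower the index with eta = diag(-,+,+,+): only the time component changes sign.
    u_lower = (-u_upper[0], u_upper[1], u_upper[2], u_upper[3])
    F = field_tensor(E, B)
    return [q * sum(F[m][n] * u_lower[n] for n in range(4)) for m in range(4)]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


# Illustrative values: an electron-like charge with |v| of about 0.62 c.
q = -1.602e-19
v = (0.3 * C, -0.2 * C, 0.5 * C)
E = (1.0e3, -2.0e3, 5.0e2)
B = (1.0e-6, 2.0e-6, -3.0e-6)

gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C**2)
rhs = covariant_force(q, v, E, B)
vxB = cross(v, B)
lorentz = [q * gamma * (E[i] + vxB[i]) for i in range(3)]         # eq. (8.60)
power = q * gamma * sum(E[i] * v[i] for i in range(3)) / C        # gamma/c times eq. (8.66)
print("spatial components:", rhs[1:], "vs", lorentz)
print("time component    :", rhs[0], "vs", power)
```

The spatial components agree with $q\gamma(\mathbf{E} + \mathbf{v}\times\mathbf{B})$, while the time component equals $(\gamma/c)\, q\,\mathbf{E}\cdot\mathbf{v}$, to which the magnetic field does not contribute.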


12 As mentioned in Note 2 on page 40, 
the use of the word “force” here is an 
abuse of language, since force is associ- 
ated with the Newtonian concept of in- 
stantaneous action, in which the force 
between two particles depends on their 
instantaneous positions. What makes 
eq. (8.65) fully consistent with the prin- 
ciples of relativity is that the force on 
a particle, at time t and position x, is 
expressed in terms of the electric and 
magnetic fields at the same value of t 
and x but, as we will see in Chapter 10, 
these are determined in terms of the 
“retarded position” of the source, i.e., 
the position that the source had at an 
earlier time, consistent with the propa- 
gation of signals at the speed of light. 
A more appropriate name for eq. (8.65) 
could be “the relativistic equation of 
motion for a particle in an external 
electromagnetic field.” We will, how- 
ever, often use the common expression 
“Lorentz force” even in the relativistic 
setting. 




This subsection is more advanced and 
can be skipped at first reading. 


13 At this stage, a multiplicative con- 
stant could simply be reabsorbed into 
q, rather than already assuming that q 
is the electric charge. However, we will 
see below that eq. (8.68) is indeed the 
action whose equation of motion gives 
the Lorentz force equation, with q iden- 
tified with the electric charge without 
extra multiplicative factors. 


8.6.2 The interaction action of a point particle 


An alternative derivation of eq. (8.62), more in line with typical reasoning used in modern theoretical physics, is as follows. We saw in eq. (7.133) that the action of a free (point-like) particle is given by

$$ S_{\rm free} = -mc^2\int d\tau . \tag{8.67} $$

Let us understand the form of the action that describes the interaction of this particle with the electromagnetic field $A^\mu$. Just as we did for deriving the free action (8.67), we use symmetry principles, observing that, in order to respect the Lorentz covariance that we have discovered in Maxwell’s equations, also the interaction action must be Lorentz invariant. The variation of a Lorentz-invariant action with respect to a four-vector dynamical variable such as $x^\mu(\tau)$ will then produce Lorentz-covariant equations of motion.

The integration variable in the interaction action must therefore be again $d\tau$, which is the only Lorentz-invariant generalization of the integral over $dt$ that appears in the non-relativistic action of a particle. The interaction action must also involve the gauge potential $A^\mu(x)$, computed on the trajectory $x^\mu(\tau)$ of the particle. To obtain a Lorentz-invariant action we therefore need something to saturate the Lorentz index of $A^\mu(x)$, and, for a point particle without an internal structure, the only other four-vector that we have at our disposal is its four-velocity $u^\mu(\tau)$. The simplest possibility is then a linear coupling, proportional to $u_\mu(\tau)A^\mu[x(\tau)]$. The interaction must also be proportional to the charge $q$ of the particle: if $q = 0$, there is no interaction between the particle and the electromagnetic field. We are therefore led to postulating an interaction action

$$ S_{\rm int} = q\int d\tau\, u_\mu(\tau) A^\mu[x(\tau)] , \tag{8.68} $$

apart from an overall numerical multiplicative constant.¹³
The total action $S = S_{\rm free} + S_{\rm int}$ of a point-like particle in an external electromagnetic field can therefore be written as

$$ S = -mc^2\int d\tau + q\int d\tau\, u_\mu(\tau) A^\mu[x(\tau)] . \tag{8.69} $$

We can now express $d\tau$ in terms of $dt$ using $d\tau = dt/\gamma$ and, correspondingly, we use $t$ and $\mathbf{x}(t)$ instead of $x^\mu(\tau)$ in the arguments of $\phi$ and A. Using eqs. (7.122) and (7.123), as well as eq. (8.12), we get

$$ S = \int dt\left[ -mc^2\sqrt{1 - \frac{v^2}{c^2}} - q\phi[t, \mathbf{x}(t)] + q\,\mathbf{v}\cdot\mathbf{A}[t, \mathbf{x}(t)] \right] . \tag{8.70} $$

Then, the Lagrangian is

$$ L[\mathbf{x}(t), \mathbf{v}(t)] = -mc^2\sqrt{1 - \frac{v^2(t)}{c^2}} - q\phi[t, \mathbf{x}(t)] + q\,\mathbf{v}\cdot\mathbf{A}[t, \mathbf{x}(t)] . \tag{8.71} $$

It is also interesting to observe that eq. (8.68) can be rewritten as

$$ S_{\rm int} = q\int d\tau\, u^\mu(\tau)\int d^4x\, \delta^{(4)}[x - x(\tau)]\, A_\mu(x) . \tag{8.72} $$

Exchanging the integrals over $d^4x$ and over $d\tau$, this can be rewritten as

$$ S_{\rm int} = \frac{1}{c}\int d^4x\, j^\mu(x) A_\mu(x) , \tag{8.73} $$

where $j^\mu(x)$ is the current defined in eq. (8.3). Observe that, in this expression, the argument $x$ in $j^\mu(x)$ and in $A_\mu(x)$ is the generic space-time point, while in eq. (8.71) the spatial argument was evaluated on the position $\mathbf{x}(t)$ of the particle at time $t$.

Since a generic current $j^\mu(x)$ can always be thought of as generated by a superposition of point charges, eq. (8.73) can be taken as the general form of the interaction of an arbitrary four-current density $j^\mu(x)$ with an external electromagnetic field.¹⁴ More explicitly, using eqs. (8.9) and (8.13), in terms of $\rho(t,\mathbf{x})$, $\mathbf{j}(t,\mathbf{x})$, $\phi(t,\mathbf{x})$ and $\mathbf{A}(t,\mathbf{x})$, we have

$$ S_{\rm int} = \int dt\, d^3x \left[ -\rho(t,\mathbf{x})\,\phi(t,\mathbf{x}) + \mathbf{j}(t,\mathbf{x})\cdot\mathbf{A}(t,\mathbf{x}) \right] , \tag{8.74} $$

where we also used $dx^0 = c\,dt$.

The action (8.73) is gauge-invariant, as long as the current $j^\mu$ is conserved. Indeed, under the gauge transformation (8.14),

$$ \begin{aligned} S_{\rm int} \to {} & S_{\rm int} - \frac{1}{c}\int d^4x\, j^\mu(t,\mathbf{x})\,\partial_\mu\theta \\ = {} & S_{\rm int} + \frac{1}{c}\int d^4x\, \theta\,\partial_\mu j^\mu(t,\mathbf{x}) \\ = {} & S_{\rm int} . \end{aligned} \tag{8.75} $$

Note that, in the integration by parts, we dropped the boundary terms. This can be justified, restricting to gauge functions $\theta(x)$ that go to zero sufficiently fast at infinity (or to a localized current $j^\mu$). We see that gauge invariance requires current conservation.¹⁵

14 Observe that we are considering here the electromagnetic field as a given external field. We will include the dynamics of the electromagnetic field at the level of the action in Section 8.7, see in particular eq. (8.129).

15 This is a point with important implications in quantum field theory, on which we will not elaborate further here. See e.g., Chapter 7 of Maggiore (2005).

The equation of motion can now be obtained from the Lagrangian variational principle: for a system with coordinates $q_i(t)$,

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 . \tag{8.76} $$

We use the Lagrangian in the form (8.71), where the role of $q_i(t)$ is played by $x_i(t)$, while $\dot{q}_i = v_i$. We compute the derivatives explicitly. First of all,

$$ \frac{\partial L}{\partial v^i} = mc^2\left( 1 - \frac{v^2}{c^2} \right)^{-1/2}\frac{v^i}{c^2} + qA^i(t,\mathbf{x}) = \gamma m v^i + qA^i(t,\mathbf{x}) = p^i + qA^i(t,\mathbf{x}) . \tag{8.77} $$

Note that, by definition, the derivative of the Lagrangian with respect to the velocity is the momentum conjugate to the coordinate $x^i$. Therefore, in the presence of an electromagnetic field, the momentum conjugate to the coordinate $x^i$, that we denote by $P^i$, is not just the same as the momentum $p^i = \gamma m v^i$ of a free particle but, rather, is given by

$$ \mathbf{P} = \mathbf{p} + q\mathbf{A} . \tag{8.78} $$

We next compute

$$ \frac{d}{dt}\left[ p^i + qA^i(t,\mathbf{x}(t)) \right] = \frac{dp^i}{dt} + q\left[ \frac{\partial A^i(t,\mathbf{x})}{\partial t} + \frac{dx^j(t)}{dt}\,\partial_j A^i(t,\mathbf{x}) \right] = \frac{dp^i}{dt} + q\left[ \frac{\partial A^i(t,\mathbf{x})}{\partial t} + v^j\partial_j A^i(t,\mathbf{x}) \right] . \tag{8.79} $$

Note that, for spatial indices, we do not need to be careful about the upper/lower positioning, and we can equally well keep all of them lower and sum over repeated lower indices. We finally compute the variation with respect to $x_i$:

$$ \frac{\partial L}{\partial x_i} = -q\,\partial_i\phi + q\, v_j\,\partial_i A_j . \tag{8.80} $$

(Recall that in the variation performed to obtain the equations of motion, $q_i$ and $\dot{q}_i$ are taken as independent variables). We now write

$$ v_j\partial_i A_j = v_j(\partial_i A_j - \partial_j A_i) + v_j\partial_j A_i = v_j\,\epsilon_{ijk}B_k + v_j\partial_j A_i , \tag{8.81} $$

and putting everything together, we get

$$ \begin{aligned} 0 &= \frac{dp^i}{dt} + q\left( \frac{\partial A^i}{\partial t} + v^j\partial_j A^i \right) - q\left( \epsilon_{ijk}v_j B_k + v^j\partial_j A^i \right) + q\,\partial_i\phi \\ &= \frac{dp^i}{dt} + q\,\frac{\partial A^i}{\partial t} - q\,\epsilon_{ijk}v_j B_k + q\,\partial_i\phi , \end{aligned} \tag{8.82} $$

where we observed that the two terms proportional to $v^j\partial_j A^i$ canceled among them. Therefore, the equation of motion derived from the Lagrangian (8.71) is

$$ \frac{d\mathbf{p}}{dt} = q\left( -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t} \right) + q\,\mathbf{v}\times\mathbf{B} , \tag{8.83} $$

where $\mathbf{p} = \gamma(v)m\mathbf{v}$. Using eq. (3.83), we see that this is just the Lorentz force equation (3.6). The fact that we have derived it from a fully covariant action shows again that it is indeed the spatial component of a fully relativistic equation.

Finally, we can obtain the Hamiltonian from

$$ H(\mathbf{P}, \mathbf{x}) = \mathbf{P}\cdot\mathbf{v} - L , \tag{8.84} $$

where the Lagrangian L is given by eq. (8.71) and v must be written in terms of P and A using eqs. (7.138) and (8.78). The inversion of eq. (7.138) gives

$$ \mathbf{v} = c\,\frac{\mathbf{p}}{\sqrt{p^2 + m^2c^2}} , \tag{8.85} $$

and therefore, in terms of P and x,

$$ \mathbf{v} = c\,\frac{\mathbf{P} - q\mathbf{A}}{\sqrt{(\mathbf{P} - q\mathbf{A})^2 + m^2c^2}} . \tag{8.86} $$

Inserting this into eq. (8.84) we get

$$ H(\mathbf{P}, \mathbf{x}) = c\,\sqrt{(\mathbf{P} - q\mathbf{A})^2 + m^2c^2} + q\phi , \tag{8.87} $$

where the potentials $\mathbf{A}(t,\mathbf{x})$ and $\phi(t,\mathbf{x})$ must be computed on the position $\mathbf{x} = \mathbf{x}(t)$ of the particle.
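The Legendre transform leading to eq. (8.87) can be checked numerically at a single phase-space point. The sketch below is an illustration, not part of the original derivation: the function names are hypothetical, the values of $m$, $q$, $c$, $\phi$ and A are arbitrary numbers in arbitrary units, and the potentials are treated as already evaluated at the position of the particle.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lagrangian(m, q, c, v, phi, A):
    """Eq. (8.71), with phi and A already evaluated at the particle position."""
    return -m * c**2 * math.sqrt(1.0 - dot(v, v) / c**2) - q * phi + q * dot(v, A)


def canonical_momentum(m, q, c, v, A):
    """Eq. (8.78): P = gamma m v + q A."""
    gamma = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    return tuple(gamma * m * vi + q * Ai for vi, Ai in zip(v, A))


def velocity_from_momentum(m, q, c, P, A):
    """Eq. (8.86): v = c (P - qA) / sqrt((P - qA)^2 + m^2 c^2)."""
    kin = tuple(Pi - q * Ai for Pi, Ai in zip(P, A))
    norm = math.sqrt(dot(kin, kin) + m**2 * c**2)
    return tuple(c * ki / norm for ki in kin)


def hamiltonian(m, q, c, P, phi, A):
    """Eq. (8.87): H = c sqrt((P - qA)^2 + m^2 c^2) + q phi."""
    kin = tuple(Pi - q * Ai for Pi, Ai in zip(P, A))
    return c * math.sqrt(dot(kin, kin) + m**2 * c**2) + q * phi


# Arbitrary illustrative numbers (arbitrary units, |v| < c).
m, q, c = 2.0, 1.5, 3.0
v = (1.0, -0.5, 2.0)
phi, A = 0.7, (0.2, -1.1, 0.4)

P = canonical_momentum(m, q, c, v, A)
H_legendre = dot(P, v) - lagrangian(m, q, c, v, phi, A)   # eq. (8.84)
H_closed = hamiltonian(m, q, c, P, phi, A)                # eq. (8.87)
print("P.v - L          :", H_legendre)
print("closed-form H    :", H_closed)
print("v recovered from P:", velocity_from_momentum(m, q, c, P, A))
```

Both expressions for $H$ coincide, and they equal $\gamma mc^2 + q\phi$, the relativistic energy of the particle plus its electrostatic potential energy.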


8.7 Field-theoretical approach to classical 
electrodynamics 


This section is more advanced and should be skipped at first reading.

The Lagrangian formulation of classical mechanics is particularly useful for understanding the relation between symmetries and conservation laws and is also the first step toward the Hamiltonian formulation. Besides their intrinsic elegance, the Lagrangian and Hamiltonian formalisms are also the natural bridge between the classical and the quantum theory. Maxwell’s equations also admit a Lagrangian formulation which,
again, allows us to better understand the formal structure of the theory 
and the relation between symmetries and conservation laws, and is a 
prototype of a classical field theory. Furthermore, even if we will not de- 
velop these aspects in this book, this field-theoretical formulation is also 
the starting point for the quantization of the theory, leading to quantum 
electrodynamics. In this section, we will develop such a field-theoretical 
approach to classical electrodynamics. The subject is advanced, and 
here we will limit ourselves to presenting briefly the main results, refer- 
ring the reader to Maggiore (2005) (see in particular chapters 2 and 3) 
for more detailed discussions and derivations. 


8.7.1 Euler-Lagrange equations of relativistic fields 


Elementary classical mechanics deals with systems with a finite number of degrees of freedom. These are described by (generalized) coordinates $q_i(t)$, where the index $i$ labels all degrees of freedom of the system. A typical example is a system of N particles in three dimensions, in which case the index $i$ takes the values $1, \ldots, 3N$. Classical field theory generalizes this to systems with a continuous set of degrees of freedom. In the previous chapters, we have already made frequent use of the notion of field. In the context of electrodynamics, the most obvious examples are the electric field $\mathbf{E}(t, \mathbf{x})$ and the magnetic field $\mathbf{B}(t, \mathbf{x})$, or the gauge potentials $\phi(t,\mathbf{x})$ and $\mathbf{A}(t,\mathbf{x})$. We can here regard the variable x as a continuous generalization of the index $i$ in $q_i(t)$, so that we have a dynamical degree of freedom, i.e., a function of time, associated with each point of space.

16 More precisely, the Lagrangian is a functional of $q(t)$ and $\dot{q}(t)$, i.e., an object that depends on the functions $q(t)$ and $\dot{q}(t)$, rather than just on a finite number of variables. The corresponding derivatives should really be defined as functional derivatives at the level of integrated quantities (such as the action), with the rule

$$ \frac{\delta q(t')}{\delta q(t)} = \delta(t - t') , \tag{8.93} $$

and the standard composition rules for derivatives, so that, for instance,

$$ \frac{\delta}{\delta q(t)}\int dt'\, q(t')\,\dot{q}(t') = \int dt'\, \delta(t - t')\,\dot{q}(t') = \dot{q}(t) . \tag{8.94} $$

In practice, this is equivalent to naively differentiating at the level of the Lagrangian, with formal rules as

$$ \frac{\partial}{\partial q(t)}\left[ q(t)\,\dot{q}(t) \right] = \dot{q}(t) . \tag{8.95} $$

In the following, we will perform the manipulations leading to the equations of motion in this form.


In a non-relativistic context, it is convenient to organize the dynamical variables according to their transformation properties under spatial rotations. For instance, the 3N degrees of freedom $q_i(t)$ of a system of N particles in three dimensions, with $i = 1, \ldots, 3N$, are obviously organized into a set of N vectors $\mathbf{q}_a$, where $a = 1, \ldots, N$ labels the particle and, for each $a$, $\mathbf{q}_a = (q_x, q_y, q_z)_a$ is a spatial vector. In the case of fields, we have already discussed their transformation properties under rotations in Section 7.3.6. For instance, we saw that the temperature is an example of a field scalar under rotations, defined by the fact that it transforms as in eq. (7.92). Another example, more pertinent to classical electrodynamics, is the scalar gauge potential $\phi(t, \mathbf{x})$. In eq. (7.92), as in all equations relative to transformations under rotations, we suppressed the time variable $t$, since we were only interested in the behavior under spatial rotations, that do not affect time. More generally, we can write the transformation property under rotations of a scalar field, such as $\phi(t,\mathbf{x})$, as

$$ \phi(t, \mathbf{x}) \to \phi'(t, \mathbf{x}') = \phi(t, \mathbf{x}) . \tag{8.88} $$

The transformation property under rotations of a vector field, such as the electric field, is given by eq. (7.94) or, also re-instating the time dependence,

$$ E_i(t, \mathbf{x}) \to E'_i(t, \mathbf{x}') = R_{ij} E_j(t, \mathbf{x}) , \tag{8.89} $$

and similarly for any other vector field.

In a relativistic context, we are interested in the transformation properties under Lorentz transformations, that we discussed in Section 7.3.6. In particular, a field scalar under Lorentz transformations (that, when the context is clear, we simply call a scalar field) transforms as in eq. (7.101), that we rewrite here,

$$ \phi(x) \to \phi'(x') = \phi(x) . \tag{8.90} $$

For rotations, $x' = (ct, \mathbf{x}')$ and we get back eq. (7.92). A contravariant four-vector field $V^\mu(x)$ transforms as in eq. (7.102) so, in particular, the gauge field $A^\mu$ transforms as

$$ A^\mu(x) \to A'^\mu(x') = \Lambda^\mu{}_\nu A^\nu(x) . \tag{8.91} $$

Similarly, a tensor field such as $F^{\mu\nu}$ transforms as

$$ F^{\mu\nu}(x) \to F'^{\mu\nu}(x') = \Lambda^\mu{}_\rho\, \Lambda^\nu{}_\sigma\, F^{\rho\sigma}(x) . \tag{8.92} $$

Having introduced the basic variables of a field theory, the next step is to define their dynamics. This can be done by extending to fields the Lagrangian formalism of classical mechanics, and gives rise to the subject of classical field theory. To this purpose, we begin by considering a non-relativistic system with generalized coordinates $q_i(t)$. The Lagrangian is a function of the coordinates $q_i$ and their time derivatives $\dot{q}_i$, that we denote collectively as $q(t)$ and $\dot{q}(t)$, respectively, i.e., $L = L[q(t), \dot{q}(t)]$.¹⁶ The action is defined as (compare to eq. (7.124), where we considered the case of a single particle)

$$ S = \int dt\, L[q(t), \dot{q}(t)] . \tag{8.96} $$

The action principle is obtained considering the action integrated between fixed initial and final times, $t_i$ and $t_f$, and studying its variation under a change of the trajectory $q_i(t) \to q_i(t) + \delta q_i(t)$, with $\delta q_i(t) = 0$ at $t = t_i$ and at $t = t_f$, i.e., we study how S varies if we change $q_i(t)$ while keeping it fixed at the initial and final times. If $q_i(t) \to q_i(t) + \delta q_i(t)$, then $\dot{q}_i(t) \to \dot{q}_i(t) + d/dt[\delta q_i(t)]$, i.e., $\delta\dot{q}_i = d/dt[\delta q_i(t)]$. Therefore, the variation of the action is

$$ \begin{aligned} \delta S &= \int_{t_i}^{t_f} dt \sum_i \left[ \frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot{q}_i}\,\delta\dot{q}_i \right] \\ &= \int_{t_i}^{t_f} dt \sum_i \left[ \frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial \dot{q}_i}\,\frac{d}{dt}\delta q_i \right] \\ &= \int_{t_i}^{t_f} dt \sum_i \left[ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} \right] \delta q_i , \end{aligned} \tag{8.97} $$

where, in the last line, we integrated $d/dt$ by parts and used the fact that, since $\delta q_i(t) = 0$ at $t = t_i$ and $t = t_f$, the boundary term vanishes. The action principle states that the classical trajectory is such that, under such a variation of the $q_i(t)$, $\delta S = 0$. Since the variations $\delta q_i$ are taken to be independent, each of the terms in the sum over $i$ in eq. (8.97) must vanish independently and, since this must happen for an arbitrary variation $\delta q_i(t)$, we must have

$$ \frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0 , \tag{8.98} $$


for each value of $i$. These are the equations of motion (or Euler-Lagrange equations) of the system. We now want to generalize the action principle from a system described by a finite number of mechanical variables $q_i(t)$, to fields and we want to construct a Lorentz-covariant field theory. To begin, we consider a single (Lorentz) scalar field $\phi(x)$. For a non-relativistic system with generalized coordinates $q_i(t)$, we have seen that the Lagrangian is a function of $q_i$ and $\dot{q}_i$. However, if we want to construct a Lorentz-covariant formalism, the Lagrangian of a scalar field cannot be a function of $\phi$ and $\partial_t\phi$, since $\partial_t\phi$ is not a covariant quantity. Rather, we must use $\partial_\mu\phi$ which, as we saw in eq. (7.107), is a four-vector field. Therefore, the Lagrangian of a scalar field will depend on $\phi$ and $\partial_\mu\phi$. Note, however, that these quantities depend not only on $t$ but also on x. As mentioned previously, we can think of x as an “index” that labels the dynamical variables of the theory, corresponding to the index $i$ of the variables $q_i(t)$ in classical mechanics, except that now this label is continuous rather than discrete: we have a dynamical variable associated with each point in space. To understand how to deal with this label in the continuous case, consider first the simple case of a set of free non-relativistic particles, with masses $m_i$ and coordinates $q_i(t)$. The Lagrangian of each one is given by $L_i = (1/2)m_i\dot{q}_i^2$, and the total Lagrangian is

$$ L = \sum_i \frac{1}{2}\, m_i \dot{q}_i^2 . \tag{8.99} $$

In the limit in which the discrete index $i$ becomes a continuous variable x, the sum over $i$ becomes an integral over $d^3x$. These considerations suggest writing the Lagrangian of a scalar field in the form


= T #PxcL|d, 0,4). (8.100) 
The function L[¢, 0,4] is called the Lagrangian density (although, with 
a common abuse of notation, it is often called simply the Lagrangian), 


and we have defined it extracting a factor of c for later convenience. The 
action is then obtained as 


$$ S = \int dt\, L = c\int dt\, d^3x\; \mathcal{L}[\phi, \partial_\mu\phi] = \int d^4x\; \mathcal{L}[\phi, \partial_\mu\phi] . \tag{8.101} $$


Note that, in this way, we reconstructed $d^4x$ as integration measure and,
as we have seen in eq. (8.5), this is Lorentz invariant. Therefore, if we
take $\mathcal{L}[\phi, \partial_\mu\phi]$ to be Lorentz invariant, the action will also be Lorentz
invariant. We will extend the integration over all of space-time, with
the boundary conditions on $\phi$ that it vanishes sufficiently fast both as
$t \to \pm\infty$ and as $|\mathbf{x}| \to \infty$, so that we can neglect all boundary terms
that will emerge from integration by parts.

We now consider a variation of the field $\phi \to \phi + \delta\phi$. Then $\partial_\mu\phi \to
\partial_\mu\phi + \delta(\partial_\mu\phi)$ with $\delta(\partial_\mu\phi) = \partial_\mu(\delta\phi)$, and the variation of the action is


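$$ \begin{aligned}
\delta S &= \int d^4x \left[ \frac{\delta\mathcal{L}}{\delta\phi}\,\delta\phi + \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)}\,\delta(\partial_\mu\phi) \right] \\
&= \int d^4x \left[ \frac{\delta\mathcal{L}}{\delta\phi}\,\delta\phi + \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)}\,\partial_\mu(\delta\phi) \right] \\
&= \int d^4x \left[ \frac{\delta\mathcal{L}}{\delta\phi} - \partial_\mu \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)} \right] \delta\phi ,
\end{aligned} \tag{8.102} $$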


where, in the last line, we integrated $\partial_\mu$ by parts discarding boundary
terms, given our boundary conditions at infinity. The classical evolution
of the field is defined by the condition that the classical solution of the
equations of motion is an extremum of $S$, i.e., is such that $\delta S = 0$ for
a generic variation $\delta\phi$. This means that the quantity in brackets must
vanish. We then obtain the Euler-Lagrange equations for a relativistic
scalar field,


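$$ \frac{\delta\mathcal{L}}{\delta\phi} - \partial_\mu \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)} = 0 . \tag{8.103} $$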


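This is the generalization of eq. (8.98) to field theory. As an example,
the simplest Lagrangian density of a real scalar field is given by$^{17}$

$$ \mathcal{L} = -\frac{1}{2}\left( \partial_\mu\phi\,\partial^\mu\phi + \mu^2\phi^2 \right) , \tag{8.104} $$

with $\mu$ a parameter with dimensions of inverse length, as $\partial_\mu$. Then,

$$ \frac{\delta\mathcal{L}}{\delta\phi} = -\mu^2\phi , \tag{8.105} $$

while

$$ \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi)} = -\partial^\mu\phi , \tag{8.106} $$

so eq. (8.103) becomes

$$ (-\Box + \mu^2)\,\phi = 0 . \tag{8.107} $$

This is called the Klein-Gordon equation. Searching for a solution of
the form

$$ \phi(x) = A_k\, e^{ikx} , \tag{8.108} $$

and using $\partial_\mu e^{ikx} = i k_\mu e^{ikx}$ and $\Box e^{ikx} = -k_\mu k^\mu e^{ikx}$, we get the condition

$$ k^2 + \mu^2 = 0 , \tag{8.109} $$

where $k^2 = k_\mu k^\mu = -(k^0)^2 + \mathbf{k}^2$. More explicitly, eq. (8.109) then reads

$$ (k^0)^2 = \mathbf{k}^2 + \mu^2 . \tag{8.110} $$

In Chapter 9 we will study in detail how similar equations for the gauge
potential (but without the $\mu^2$ term) give rise to electromagnetic wave
solutions.$^{18}$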
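The steps from the Lagrangian density (8.104) to the dispersion relation (8.110) can be cross-checked symbolically. The following is only a sketch, restricted to one space dimension for brevity and assuming that SymPy is available; the helper `klein_gordon` and all variable names are ours, not the book's.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t, x = sp.symbols("t x", real=True)
k = sp.symbols("k", real=True)
c, mu = sp.symbols("c mu", positive=True)
phi = sp.Function("phi")(t, x)


def klein_gordon(f):
    """Apply (-Box + mu^2) in 1+1 dimensions, Box = -(1/c^2) d_t^2 + d_x^2."""
    box_f = -sp.diff(f, t, 2) / c**2 + sp.diff(f, x, 2)
    return -box_f + mu**2 * f


# Lagrangian density (8.104) with signature (-,+):
# d_mu phi d^mu phi = -(d_t phi / c)^2 + (d_x phi)^2
lagrangian = -sp.Rational(1, 2) * (
    -(sp.diff(phi, t) / c) ** 2 + sp.diff(phi, x) ** 2 + mu**2 * phi**2
)

# Euler-Lagrange equation; SymPy returns a list of Eq(expression, 0)
(el_eq,) = euler_equations(lagrangian, phi, [t, x])

# The Euler-Lagrange expression equals -(-Box + mu^2) phi, so the sum vanishes
el_residual = sp.simplify(el_eq.lhs + klein_gordon(phi))

# Plane wave exp(i k_mu x^mu) with k^0 fixed by the dispersion relation (8.110)
k0 = sp.sqrt(k**2 + mu**2)
wave = sp.exp(sp.I * (-k0 * c * t + k * x))
wave_residual = sp.simplify(klein_gordon(wave) / wave)

print(el_residual, wave_residual)
```

Both residuals vanishing confirms, within this simplified setting, that eq. (8.103) applied to (8.104) gives the Klein-Gordon operator and that the plane wave solves it once $(k^0)^2 = k^2 + \mu^2$.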

If we have several scalar fields $\phi_i(x)$, the Lagrangian density depends
on $\phi_i$ and $\partial_\mu\phi_i$ for all values of the index $i$, and the variation with respect
to each of these fields must vanish, so eq. (8.103) simply becomes

$$ \frac{\delta\mathcal{L}}{\delta\phi_i} - \partial_\mu \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_i)} = 0 , \tag{8.111} $$

for each value of the index $i$. Consider now a four-vector field $A_\mu$. The
Lagrangian density will be a function of $A_\mu$ and its derivatives $\partial_\mu A_\nu$,
$\mathcal{L} = \mathcal{L}[A_\mu, \partial_\mu A_\nu]$, and

$$ S = \int d^4x\; \mathcal{L}[A_\mu, \partial_\mu A_\nu] . \tag{8.112} $$

The equations of motion are obtained by requiring that the variation of the
action with respect to each of the four components of $A_\mu$ vanish, and
therefore are the same as eq. (8.111), with $\phi_i$ replaced by $A_\nu$,


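$$ \frac{\delta\mathcal{L}}{\delta A_\nu} - \partial_\mu \frac{\delta\mathcal{L}}{\delta(\partial_\mu A_\nu)} = 0 . \tag{8.113} $$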


17 The overall factor $-1/2$ is irrelevant
for the equations of motion, but can be
fixed by requiring that the corresponding
Hamiltonian, or energy density, is correctly
normalized, see e.g., Section 3.3.1
of Maggiore (2005).


18 Comparing with eq. (7.146) we see
that eq. (8.110) has a form analogous to
the dispersion relation of a relativistic
massive particle. This, however, only
becomes true in the context of quantum
theory. In fact, $k^0$ and $\mathbf{k}$ have
dimensions of inverse length, and can
be identified with the energy and momentum
of a particle only through the
quantum-mechanical relations $E/c = \hbar k^0$
and $\mathbf{p} = \hbar\mathbf{k}$. Then, eq. (8.110)
takes the form (7.147) with the
identification $m = \hbar\mu/c$.
19 Recall that our metric is
$\eta_{\mu\nu} = (-1,1,1,1)$. If one uses
instead the opposite signature,
$\eta_{\mu\nu} = (1,-1,-1,-1)$, the term
$F_{\mu\nu}F^{\mu\nu} = \eta_{\mu\rho}\eta_{\nu\sigma}F^{\rho\sigma}F^{\mu\nu}$ is
unchanged because it involves two factors
of the metric, while $A_\mu j^\mu = \eta_{\mu\nu}A^\nu j^\mu$
changes sign, so the action becomes

$$ S = -\epsilon_0 c \int d^4x \left( \frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \mu_0 A_\mu j^\mu \right) . \tag{8.117} $$


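20 Note that, in eq. (8.98), $\delta L/\delta q_i$ is the
derivative with respect to $q_i$ at fixed
$\dot{q}_i$, while $\delta L/\delta\dot{q}_i$ is the derivative with
respect to $\dot{q}_i$ at fixed $q_i$. Similarly, in
eq. (8.111) [or in eq. (8.113)] $\delta\mathcal{L}/\delta\phi_i$ is
taken at fixed $\partial_\mu\phi_i$ and $\delta\mathcal{L}/\delta(\partial_\mu\phi_i)$ is
taken at fixed $\phi_i$.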


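21 Explicitly,

$$ \begin{aligned}
F^{\alpha\beta}\,\frac{\delta F_{\alpha\beta}}{\delta(\partial_\mu A_\nu)}
&= F^{\alpha\beta}\,\frac{\delta\left(\eta_{\alpha\alpha'}\eta_{\beta\beta'}F^{\alpha'\beta'}\right)}{\delta(\partial_\mu A_\nu)} \\
&= \eta_{\alpha\alpha'}\eta_{\beta\beta'}F^{\alpha\beta}\,\frac{\delta F^{\alpha'\beta'}}{\delta(\partial_\mu A_\nu)} \\
&= F_{\alpha'\beta'}\,\frac{\delta F^{\alpha'\beta'}}{\delta(\partial_\mu A_\nu)} .
\end{aligned} \tag{8.120} $$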


8.7.2 Lagrangian of the electromagnetic field 


We now show that Maxwell’s equations can be derived from an action
principle, and we identify the Lagrangian of the electromagnetic field.
We work with a covariant formalism, using $A_\mu(x)$ as our fundamental
dynamical field. Then, we only need to show that eq. (8.23) is the
equation of motion derived from a Lagrangian since, as we have seen, the
other two Maxwell equations, contained in eq. (8.33), are an automatic
consequence of the introduction of the gauge potentials. We consider
the Lagrangian density


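$$ \mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\rm int} , \tag{8.114} $$

where

$$ \mathcal{L}_0 = -\frac{\epsilon_0 c}{4}\, F_{\mu\nu}F^{\mu\nu} \tag{8.115} $$

is the Lagrangian of the free electromagnetic field, while

$$ \mathcal{L}_{\rm int} = \frac{1}{c}\, A_\mu j^\mu \tag{8.116} $$

describes the interaction of the gauge field with a given external current
$j^\mu$. Factoring out $\epsilon_0 c$, the corresponding action is therefore$^{19}$

$$ S = \epsilon_0 c \int d^4x \left( -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \mu_0 A_\mu j^\mu \right) . \tag{8.118} $$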


Note that the interaction action was already obtained in eq. (8.73). To
derive the equations of motion of this action we must compute the
(functional) derivatives that appear in eq. (8.113). We perform the
computation in detail. We first compute the derivative of the Lagrangian with
respect to $\partial_\mu A_\nu$. In this case, only $\mathcal{L}_0$ contributes, since $\mathcal{L}_{\rm int}$ depends
on $A_\mu$, but not on its derivatives.$^{20}$ First of all, it is necessary to change
the names of the dummy indices $\mu,\nu$ in eq. (8.115), in order not to mix
them up with the indices $\mu,\nu$ that appear in eq. (8.113). We then write
$\mathcal{L}_0 = -(\epsilon_0 c/4)\, F_{\alpha\beta}F^{\alpha\beta}$. Then,


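$$ \begin{aligned}
\frac{1}{\epsilon_0 c}\,\frac{\delta\mathcal{L}_0}{\delta(\partial_\mu A_\nu)}
&= -\frac{1}{4}\,\frac{\delta\left(F_{\alpha\beta}F^{\alpha\beta}\right)}{\delta(\partial_\mu A_\nu)} \\
&= -\frac{1}{4}\,\frac{\delta F_{\alpha\beta}}{\delta(\partial_\mu A_\nu)}\,F^{\alpha\beta}
 - \frac{1}{4}\,F_{\alpha\beta}\,\frac{\delta F^{\alpha\beta}}{\delta(\partial_\mu A_\nu)} .
\end{aligned} \tag{8.119} $$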


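We now observe that the two terms in the last line are equal, since the
indices $\alpha, \beta$ can be raised and lowered with the Minkowski metric, which
commutes with the derivatives with respect to any field.$^{21}$ Then

$$ \begin{aligned}
\frac{1}{\epsilon_0 c}\,\frac{\delta\mathcal{L}_0}{\delta(\partial_\mu A_\nu)}
&= -\frac{1}{2}\,F^{\alpha\beta}\,\frac{\delta F_{\alpha\beta}}{\delta(\partial_\mu A_\nu)} \\
&= -\frac{1}{2}\,F^{\alpha\beta}\,\frac{\delta(\partial_\alpha A_\beta - \partial_\beta A_\alpha)}{\delta(\partial_\mu A_\nu)} \\
&= -F^{\alpha\beta}\,\frac{\delta(\partial_\alpha A_\beta)}{\delta(\partial_\mu A_\nu)} ,
\end{aligned} \tag{8.121} $$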
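where, in the last line, we used the fact that $F^{\alpha\beta}$ is antisymmetric under
$\alpha \leftrightarrow \beta$, so,

$$ F^{\alpha\beta}\,\partial_\beta A_\alpha = F^{\beta\alpha}\,\partial_\alpha A_\beta = -F^{\alpha\beta}\,\partial_\alpha A_\beta , \tag{8.122} $$

where we first renamed the dummy indices as $\alpha \to \beta$ and $\beta \to \alpha$ and we
then used $F^{\beta\alpha} = -F^{\alpha\beta}$. We next use$^{22}$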


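$$ \frac{\delta(\partial_\alpha A_\beta)}{\delta(\partial_\mu A_\nu)} = \delta^\mu_\alpha\,\delta^\nu_\beta , \tag{8.123} $$

and we get

$$ \frac{1}{\epsilon_0 c}\,\frac{\delta\mathcal{L}_0}{\delta(\partial_\mu A_\nu)} = -F^{\mu\nu} . \tag{8.124} $$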


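The derivative of $\mathcal{L}$ with respect to $A_\nu$ at fixed $\partial_\mu A_\nu$ is easily evaluated:
now $\mathcal{L}_0$ does not contribute, since it depends only on the derivatives of
the gauge field, and the contribution only comes from $\mathcal{L}_{\rm int}$, which gives

$$ \frac{1}{\epsilon_0 c}\,\frac{\delta\mathcal{L}_{\rm int}}{\delta A_\nu}
 = \mu_0\,\frac{\delta A_\mu}{\delta A_\nu}\, j^\mu
 = \mu_0\,\delta^\nu_\mu\, j^\mu
 = \mu_0\, j^\nu . \tag{8.125} $$

Inserting these results into eq. (8.113) we finally get

$$ \partial_\mu F^{\mu\nu} = -\mu_0\, j^\nu , \tag{8.126} $$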


and we have therefore recovered eq. (8.23). This shows that the
Lagrangian given by eqs. (8.114)-(8.116) is indeed the Lagrangian of the
electromagnetic field interacting with an external current.$^{23}$ In particular,
we have found that the action of the free electromagnetic field is

$$ S_0 = -\frac{\epsilon_0 c}{4} \int d^4x\; F_{\mu\nu}F^{\mu\nu}
 = \frac{\epsilon_0}{2} \int dt\, d^3x \left( E^2 - c^2 B^2 \right) , \tag{8.127} $$
where we used eqs. (8.17) and (8.20), as well as $dx^0 = c\,dt$. Observe that
the action of the free electromagnetic field is gauge invariant since, under
the gauge transformation (8.14), $F_{\mu\nu}$ is invariant. The interaction term
in eq. (8.118) is also gauge invariant. Indeed, applying $\partial_\nu$ to eq. (8.126),
we get

$$ \partial_\nu\partial_\mu F^{\mu\nu} = -\mu_0\, \partial_\nu j^\nu . \tag{8.128} $$

However, $F^{\mu\nu}$ is antisymmetric in $\mu,\nu$, while the operator $\partial_\nu\partial_\mu$ is
symmetric (as usual, we assume that it acts on differentiable functions on
which the derivatives commute). Then $\partial_\nu\partial_\mu F^{\mu\nu}$ vanishes automatically,
and eq. (8.128) implies $\partial_\nu j^\nu = 0$. This is in fact nothing but the
derivation of current conservation from the equations of motion, that we have


22 Recall, from the discussion around
eqs. (7.104)-(7.108), that the derivative
with respect to a quantity with lower
index produces an object with upper index,
so $\mu, \nu$ on the right-hand side of
eq. (8.123) must be in the upper position.
Since $\delta(\partial_\alpha A_\beta)/\delta(\partial_\mu A_\nu)$ is equal
to one if $\alpha = \mu$ and $\beta = \nu$ and is
zero otherwise, the result is a product
of Kronecker deltas.


23 More precisely, this shows that this is 
a possible choice for such a Lagrangian. 
In general, the Lagrangian that repro- 
duces a given equation of motion is not 
unique. In the classical mechanics of 
non-relativistic systems, if we add a to- 
tal time derivative to the Lagrangian in 
eq. (8.96), the variational principle is 
not affected, since a total time derivative
in the Lagrangian gives a boundary
term in the action, and the variation
of the action is computed keeping
$q_i(t)$ fixed on the boundary, i.e., at
$t = t_i$ and at $t = t_f$. Similarly, the
addition to the Lagrangian in eq. (8.101)
or in eq. (8.112) of a term of the form
$\partial_\mu K^\mu$, with $K^\mu$ an arbitrary function
of the fields, does not affect the equations
of motion, because it is a
(three-dimensional) boundary term.

Also note that, at the level of 
eq. (8.118), we could multiply the ac- 
tion by an arbitrary multiplicative fac- 
tor without affecting the equations of 
motion. A simple way to fix the correct
normalization, which is indeed given by
the factor $\epsilon_0 c$ in eq. (8.118), is obtained
by also including the dynamics of the 
point particles. This just amounts to 
requiring that the interaction term is 
normalized as in eq. (8.73), since we 
already saw in Section 8.6.2 that this 
gives the Lorentz force equation with 
the correct numerical factors. Other- 
wise, we will see in the following how 
the Lagrangian determines the energy 
density (or, more generally, the energy- 
momentum tensor), and then one can 
fix the normalization of the Lagrangian 
requiring the correct normalization for 
the energy density of the electromag- 
netic field, which is given by eq. (3.41). 
The two procedures are of course equiv- 
alent, since the normalization of the en- 
ergy density of the electromagnetic field 
was also obtained by comparison with 
the mechanical energy of point parti- 
cles, see eqs. (3.39) and (3.40). 
already found in Section 3.2.1 and, in the covariant formalism, in
Section 8.1. As we have already seen in eq. (8.75), current conservation
(together with suitable boundary conditions at infinity, either on the gauge
function $\theta$ or on the current) implies that the interaction action is gauge invariant.
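The second equality in eq. (8.127) rests on the identity $F_{\mu\nu}F^{\mu\nu} = 2(B^2 - E^2/c^2)$. The sketch below checks it numerically. It assumes the component conventions $F^{0i} = E^i/c$ and $F^{ij} = \epsilon^{ijk}B^k$ with signature $(-,+,+,+)$, which is our reading of the conventions used in this chapter; the function names and the sample field values are illustrative only.

```python
import numpy as np

C = 299_792_458.0          # speed of light (m/s)
EPS0 = 8.8541878128e-12    # vacuum permittivity (F/m)
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # signature (-,+,+,+)


def field_tensor_upper(E, B):
    """F^{mu nu}, ASSUMING F^{0i} = E^i/c and F^{ij} = eps^{ijk} B^k."""
    F = np.zeros((4, 4))
    F[0, 1:] = np.asarray(E, dtype=float) / C
    F[1, 2], F[2, 3], F[3, 1] = B[2], B[0], B[1]
    return F - F.T  # antisymmetrize: fills F^{i0} and F^{ji}


def free_lagrangian_per_volume(E, B):
    """Integrand of S_0 with respect to dt d^3x, i.e. c * L_0 (d^4x = c dt d^3x)."""
    F_up = field_tensor_upper(E, B)
    F_low = ETA @ F_up @ ETA                  # F_{mu nu}
    invariant = np.sum(F_low * F_up)          # F_{mu nu} F^{mu nu}
    return -(EPS0 * C**2 / 4.0) * invariant, invariant


E = np.array([300.0, -100.0, 200.0])      # V/m, arbitrary sample values
B = np.array([1.0e-6, 2.0e-6, -5.0e-7])   # T, arbitrary sample values

integrand, invariant = free_lagrangian_per_volume(E, B)
expected_invariant = 2.0 * (B @ B - (E @ E) / C**2)
expected_integrand = 0.5 * EPS0 * (E @ E - C**2 * (B @ B))
print(integrand, expected_integrand)
```

The comparison uses a purely relative tolerance in the test, since the quantities involved are many orders of magnitude below NumPy's default absolute tolerance.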

In eq. (8.118), or in eq. (8.126), $j^\mu$ is a given current; we have not
specified the dynamics of the charges that give rise to it, and in this sense
we referred to it generically as an “external” current. We can develop
the action principle further, by including the dynamics of these charges.
For instance, if we consider $N$ classical relativistic point charges, each
one described by its world-line $x_a^\mu(\tau)$ (with $a = 1, \ldots, N$), then, for each
charge, the current $j_a^\mu$ will be given by eq. (8.3) (with $q$, $x^\mu(\tau)$ and $u^\mu$
replaced by $q_a$, $x_a^\mu(\tau)$ and $u_a^\mu$, respectively), and the free action of these
particles will be given by eq. (7.134). Then, the total action describing
the free dynamics of $N$ charged point particles, the free dynamics of the
electromagnetic field, and the interaction among the electromagnetic
field and the particles, is


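$$ S = -\sum_a m_a c^2 \int d\tau_a \; - \; \frac{\epsilon_0 c}{4} \int d^4x\; F_{\mu\nu}F^{\mu\nu}
 \; + \; \frac{1}{c} \int d^4x\; A_\mu(x)\, j^\mu(x) , \tag{8.129} $$

where

$$ j^\mu(x) = \sum_a q_a \int c\, d\tau\; u_a^\mu(\tau)\, \delta^{(4)}[x - x_a(\tau)] , \tag{8.130} $$

and $u_a^\mu(\tau) = dx_a^\mu/d\tau$. Carrying out the integral over $d^4x$ in the
interaction term using the Dirac delta in $j^\mu(x)$, we can rewrite this as

$$ S = -\sum_a m_a c^2 \int d\tau_a \; - \; \frac{\epsilon_0 c}{4} \int d^4x\; F_{\mu\nu}F^{\mu\nu}
 \; + \; \sum_a q_a \int d\tau\; u_a^\mu(\tau)\, A_\mu[x_a(\tau)] . \tag{8.131} $$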


This is the same action that we considered in Section 8.6.2, see in partic- 
ular eqs. (8.69) and (8.70), except that, there, we considered a particle in 
a given external electromagnetic field, while here we have also included 
the dynamics of the electromagnetic field. 

Actually, the action for the free electromagnetic field that we have
found in this section is in a sense more fundamental than the
point-particle action and the interaction term between $A_\mu$ and a point
particle. In fact, at the quantum level, the point-like approximation for
the charged particles is no longer fundamental, and one rather describes 
also the particles in terms of fields. The fundamental Lagrangian then 
consists of the Lagrangian (8.115) for the free electromagnetic field, the 
appropriate Lagrangians for the matter fields (which depend on the spin 
of the charged particles considered), and of suitable interaction terms be- 
tween these matter fields and the electromagnetic field, all constructed 
so as to preserve gauge invariance (see e.g., Chapter 3 of Maggiore (2005)
for details). At distance scales sufficiently large, where the particles can 
be considered point-like, these more fundamental descriptions reduce 
to eq. (8.129). This has the conceptually important implication that 
Maxwell’s equations, in the form (3.1)—(3.4) or in the covariant form 
(8.23, 8.33), are not truly fundamental at the scale of elementary par- 
ticles. There, a more fundamental description of the coupling of the 
electromagnetic field to the matter field is required.$^{24}$ In any case, since
at these scales quantum mechanics also enters the game, the subject 
is more appropriately treated directly in the context of quantum field 
theory. 


8.7.3 Noether’s theorem 


Noether’s theorem expresses the relation between symmetries and
conservation laws in classical field theory. We discuss it here briefly, following
closely Section 3.2 of Maggiore (2005), to which we refer the reader
for more detailed discussions and derivations.$^{25}$ We consider a field
theory with fields $\phi_i(x)$ and action $S$. The notation $\phi_i(x)$ here is completely
generic and could refer, for instance, to a set of Lorentz scalar fields or,
as will be more relevant in our case, to the four components of the
four-vector field $A_\mu(x)$. We consider an infinitesimal transformation of
the coordinates and of the fields, parametrized by a set of infinitesimal
parameters $\epsilon^a$, with $a = 1, \ldots, N$, of the general form

$$ x^\mu \to x'^\mu = x^\mu + \epsilon^a A^\mu_a(x) , \tag{8.132} $$

$$ \phi_i(x) \to \phi'_i(x') = \phi_i(x) + \epsilon^a F_{i,a}(\phi, \partial\phi) , \tag{8.133} $$


with $A^\mu_a(x)$ (not to be confused with the gauge field) a given function
of the coordinates, and $F_{i,a}(\phi, \partial\phi)$ a given functional of the fields and
of their derivatives. An important distinction is between “global” and
“local” transformations. A transformation is called global if its parameters
$\epsilon^a$ are constants; it is called local if they are taken to be arbitrary
functions of space-time, $\epsilon^a(x)$.

As an example, space-time translations are transformations in which
(both for infinitesimal and finite transformations) the coordinates change
as

$$ x^\mu \to x'^\mu = x^\mu + \epsilon^\mu , \tag{8.134} $$

with $\epsilon^\mu$ a constant (so, they are global transformations), while all fields,
independently of their properties under Lorentz transformations,
transform as “scalars under translations,” i.e., as

$$ \phi_i(x) \to \phi'_i(x') = \phi_i(x) , \tag{8.135} $$

as we have already seen in eq. (6.22) for the case of spatial translations,
see in particular the discussion in Note 3 on page 137. Equation (8.134)
can be rewritten as

$$ x^\mu \to x^\mu + \epsilon^\nu \delta^\mu_\nu . \tag{8.136} $$


24 A note for the advanced reader. For
spin-1/2 particles, such as electrons and
protons, at the level of what is called
the Dirac Lagrangian, the coupling to
the electromagnetic field indeed turns
out to be of the form (8.116), with the
current $j^\mu$ given in terms of the fields
describing these particles. In contrast,
for a spin-0 charged particle, an extra
interaction term, quadratic both in the
gauge fields and in the field describing
the particle, is present; see Maggiore
(2005), eqs. (3.170) and (3.174). This
extra term gives rise to a corresponding
extra term in the equation of motion,
which then is no longer of the form
(8.23).


25 When comparing the results, note
that Maggiore (2005) uses units $\epsilon_0 = \mu_0 = c = 1$,
and the opposite metric
signature $\eta_{\mu\nu} = (+,-,-,-)$. These
conventions are the most common in
quantum field theory.
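26 The fact that this expression for
$A^k_{ij}(x)$, once inserted into eq. (8.139),
correctly gives eq. (1.153), can be
shown writing

$$ \begin{aligned}
\sum_{i<j} \omega^{ij} A^k_{ij}
&= \sum_{i<j} \omega^{ij} \left( \delta^{ik} x^j - \delta^{jk} x^i \right) \\
&= \sum_{i<j} \omega^{ij} \delta^{ik} x^j - \sum_{j<i} \omega^{ji} \delta^{ik} x^j \\
&= \sum_{i<j} \omega^{ij} \delta^{ik} x^j + \sum_{i>j} \omega^{ij} \delta^{ik} x^j \\
&= \sum_{i,j} \omega^{ij} \delta^{ik} x^j \\
&= \omega^{kj} x^j ,
\end{aligned} $$

where, for the second term, in the second
line we renamed the dummy indices
$i \leftrightarrow j$, and we then used $\omega^{ij} = -\omega^{ji}$.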


Therefore, in this case the index $a$ in eqs. (8.132) and (8.133) is just a
Lorentz index, and we have

$$ A^\mu_{(\nu)}(x) = \delta^\mu_\nu , \qquad F_{i,(\nu)} = 0 . \tag{8.137} $$

Another example is given by spatial rotations. On the spatial
coordinates $x^i$, the infinitesimal form of a spatial rotation can be written as
in eq. (1.153), while $x^0 \to x^0$. We can rewrite this in the form (8.132),
where the role of the index $a$ is played by the pair of indices $(i, j)$ that
identify the plane in which the spatial rotation is performed, and $\epsilon^a$ is
identified with the parameters $\omega^{ij}$ defined in eq. (1.149) (recall that, for
spatial rotations, we can write the spatial indices equivalently as upper
or lower indices). Then, eq. (8.132) can be written as

$$ x^0 \to x^0 + \sum_{i<j} \omega^{ij} A^0_{ij}(x) , \tag{8.138} $$

$$ x^k \to x^k + \sum_{i<j} \omega^{ij} A^k_{ij}(x) \tag{8.139} $$

(where the restriction to $i < j$ avoids a double counting, since the
parameters $\omega^{ij}$ with $i > j$ are fixed in terms of those with $i < j$ by $\omega^{ij} = -\omega^{ji}$),
with$^{26}$

$$ A^0_{ij}(x) = 0 , \qquad A^k_{ij}(x) = \delta^{ik} x^j - \delta^{jk} x^i . \tag{8.140} $$
The corresponding expression for $F$ in eq. (8.133) depends on the type
of field considered. For a field scalar under rotations, such as the scalar
gauge potential $\phi$, writing eq. (8.133) as

$$ \phi(x) \to \phi'(x') = \phi(x) + \sum_{i<j} \omega^{ij} F_{\phi,ij} , \tag{8.141} $$

we have $F_{\phi,ij} = 0$. For a vector field, the finite transformation was given
in eq. (7.94), so the infinitesimal transformation, written for instance for
the vector gauge potential $\mathbf{A}$, is

$$ A^k(x) \to A'^k(x') = A^k(x) + \omega^{kj} A^j(x) . \tag{8.142} $$

Similarly to eqs. (8.139) and (8.140), this can be written as

$$ A^k(x) \to A'^k(x') = A^k(x) + \sum_{i<j} \omega^{ij} F^k_{ij}[A(x)] , \tag{8.143} $$

with

$$ F^k_{ij}[A(x)] = \delta^{ik} A^j(x) - \delta^{jk} A^i(x) . \tag{8.144} $$


One could proceed similarly for the full set of Lorentz transformations, 
but we will limit ourselves to space-time translations and spatial ro- 
tations from which, as we will see, Noether’s theorem will allow us to 
compute the energy, momentum, and angular momentum of a given field 
theory so, in particular, of the electromagnetic field. 
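The bookkeeping of eqs. (8.139) and (8.140), where the sum is restricted to $i < j$, can be checked numerically against the unrestricted form $\omega^{kj}x^j$ derived in Note 26. This is only a sketch with randomly generated test values; the helper `A` and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
m = rng.normal(size=(3, 3))
omega = m - m.T          # antisymmetric rotation parameters omega^{ij}
x = rng.normal(size=3)   # spatial coordinates x^k
delta = np.eye(3)


def A(i, j):
    """A^k_{ij}(x) of eq. (8.140), returned as a vector over the index k."""
    return delta[i] * x[j] - delta[j] * x[i]


# Variation of x^k from eq. (8.139): sum restricted to the pairs i < j
restricted = sum(
    omega[i, j] * A(i, j) for i in range(3) for j in range(i + 1, 3)
)

# Unrestricted form omega^{kj} x^j
full = omega @ x

print(restricted, full)
```

The same function `A`, with `x` replaced by the components of a vector field, reproduces $F^k_{ij}$ of eq. (8.144), which has an identical index structure.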

Equations (8.132, 8.133) define a symmetry transformation of the
theory if they leave the action $S(\phi)$ invariant, for any configuration of the
fields $\phi_i$. Note that we are not assuming that the fields $\phi_i$ satisfy the
classical equations of motion. A symmetry by definition leaves the action
invariant for every field configuration, solution or not of the equations
of motion.

Now, suppose that eqs. (8.132) and (8.133) are a global, but not a
local, symmetry of our theory, i.e., they leave the action invariant if
$\epsilon^a$ is constant, but not if we allow $\epsilon^a$ to depend on $x$. Then, Noether’s
theorem states that, for each value of the index $a = 1, \ldots, N$ (i.e., for
each independent parameter of the transformation), there is a conserved
current $j^\mu_a$, i.e., a current that satisfies

$$ \partial_\mu j^\mu_a = 0 . \tag{8.145} $$


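This implies the existence of a corresponding set of conserved charges

$$ Q_a \equiv \frac{1}{c} \int d^3x\; j^0_a(t, \mathbf{x}) . \tag{8.146} $$

In fact, taking the time derivative and using eq. (8.145),

$$ \frac{dQ_a}{dt} = \frac{1}{c} \int d^3x\; \partial_t\, j^0_a(t, \mathbf{x})
 = -\int d^3x\; \partial_i\, j^i_a(x) , \tag{8.147} $$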


and this is a boundary term, representing the flux entering and leaving
the volume of integration. In particular, if we integrate over all space
with the boundary condition that $j^i_a$ vanishes sufficiently fast at infinity
(or if we integrate over a finite volume with $j^i_a(x) = 0$ on the boundary, so
that there is no incoming or outgoing flux), the charge $Q_a$ is conserved.
All this is identical to the conservation of the electric charge that we
first found, in a non-covariant formalism, in Section 3.2.1, and that we
found again, with the covariant formalism, in eqs. (8.10) and (8.11). The
terms “charge” and “current” in the context of the Noether theorem
are indeed borrowed from this example. However, as we have seen, the
index $a$ here can also be a Lorentz index, as in the case of space-time
translations, or can represent a pair of antisymmetric spatial indices, as
in the case of rotations, so the Lorentz transformation properties of the
“currents” $j^\mu_a(x)$ and “charges” $Q_a$ also depend on the nature of this
index.$^{27}$

Noether’s theorem also provides an explicit expression of the currents
$j^\mu_a(x)$, in terms of the Lagrangian density of the theory and of the
functions $A^\mu_a(x)$ and $F_{i,a}(\phi, \partial\phi)$ that enter in eqs. (8.132) and (8.133),$^{28}$

$$ j^\mu_a = \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_i)} \left[ A^\rho_a(x)\, \partial_\rho\phi_i - F_{i,a}(\phi, \partial\phi) \right] - \mathcal{L}\, A^\mu_a(x) . \tag{8.148} $$


27 Also observe that the multiplicative 
factor in front of the charge is, at this 
stage, arbitrary, since, if a quantity is 
conserved, multiplying it by a constant 
it remains conserved. In eq. (8.146) we 
have chosen a multiplicative factor 1/c, 
as in eq. (8.11). 


28 See Section 3.2 of Maggiore (2005)
for a proof of Noether’s theorem and
a derivation of eq. (8.148).
29 As we will see below, the factor $-c$
is chosen so as to eventually obtain the
same normalization as in eq. (8.34). In
particular, the minus sign is related to
our signature $\eta_{\mu\nu} = (-,+,+,+)$. As
already mentioned, by itself Noether’s
theorem does not fix the normalization
of the conserved current; if a current is
conserved, i.e., $\partial_\mu j^\mu = 0$, any multiple
of it is also conserved.


30 Observe that, at this stage, $\theta^{\mu\nu}$ is not
necessarily symmetric in $\mu, \nu$, and the
contraction of the index of the partial
derivative must be done with the first
index of $\theta^{\mu\nu}$, while $\nu$ was the equivalent
of the index $a$ in eq. (8.145). We
will see below how to “improve” the
energy-momentum tensor so that it
becomes symmetric, when the expression
obtained from eq. (8.150) is not
symmetric.


Energy-momentum tensor of the electromagnetic field from 
Noether’s theorem 


We now specialize the above machinery to translations, which are
symmetries of Maxwell’s theory (and of all standard field theories). In this
case, we have seen that the index $a$ is actually a Lorentz index, so the
corresponding four conserved currents $j^\mu_{(\nu)}$ form a Lorentz tensor. We
define

$$ \theta^\mu{}_\nu \equiv -c\, j^\mu_{(\nu)} . \tag{8.149} $$

This is the field-theory definition of the energy-momentum tensor.$^{29}$
Then, inserting eq. (8.137) into eq. (8.148) and raising the $\nu$ index as
$\theta^{\mu\nu} = \eta^{\nu\rho}\, \theta^\mu{}_\rho$, we get

$$ \theta^{\mu\nu} = -c \left[ \frac{\delta\mathcal{L}}{\delta(\partial_\mu\phi_i)}\, \partial^\nu\phi_i - \eta^{\mu\nu} \mathcal{L} \right] . \tag{8.150} $$

Equation (8.145) then states that, when $\theta^{\mu\nu}$ is evaluated on a solution
of the equations of motion, it satisfies$^{30}$

$$ \partial_\mu \theta^{\mu\nu} = 0 . \tag{8.151} $$


According to eq. (8.146), the conserved charge associated with the
energy-momentum tensor is

$$ P^\nu \equiv \frac{1}{c} \int d^3x\; \theta^{0\nu} , \tag{8.152} $$

and this is the definition of four-momentum in classical field theory.
A field configuration, solution of the equations of motion, carries an
energy $E = cP^0$ and a spatial momentum $P^i$ which can be calculated
using eqs. (8.150) and (8.152).
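As a quick illustration of eq. (8.150), one can work out the case of the real scalar field with Lagrangian density (8.104), for which $\delta\mathcal{L}/\delta(\partial_\mu\phi) = -\partial^\mu\phi$ by eq. (8.106). The following is a sketch of that computation with signature $(-,+,+,+)$; the shorthand $|\nabla\phi|^2 = \partial_i\phi\,\partial_i\phi$ is ours.

```latex
\theta^{\mu\nu}
  = -c\left[ -\partial^\mu\phi\,\partial^\nu\phi - \eta^{\mu\nu}\mathcal{L} \right]
  = c\left( \partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L} \right),
\qquad
\theta^{00}
  = c\left[ (\partial_0\phi)^2 - \mathcal{L} \right]
  = \frac{c}{2}\left[ (\partial_0\phi)^2 + |\nabla\phi|^2 + \mu^2\phi^2 \right] .
```

In this example $\theta^{00}$ is a sum of squares, which is in line with the remark in Note 17 that the overall factor $-1/2$ in eq. (8.104) is tied to a correctly normalized energy density, and $\theta^{\mu\nu}$ happens to be symmetric from the start.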

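We can now apply this to the free electromagnetic field. In this case,
eq. (8.150) becomes

$$ \theta^{\mu\nu} = -c \left[ \frac{\delta\mathcal{L}}{\delta(\partial_\mu A_\rho)}\, \partial^\nu A_\rho - \eta^{\mu\nu} \mathcal{L} \right] , \tag{8.153} $$

and $\mathcal{L}$ is given by eq. (8.115). The derivative $\delta\mathcal{L}/\delta(\partial_\mu A_\rho)$, which appears
in eq. (8.153), was already computed in eq. (8.124), so we get

$$ \theta^{\mu\nu} = \epsilon_0 c^2 \left[ F^{\mu\rho}\, \partial^\nu A_\rho - \frac{1}{4}\, \eta^{\mu\nu} F_{\rho\sigma}F^{\rho\sigma} \right] . \tag{8.154} $$

We next write $\partial^\nu A_\rho = (\partial^\nu A_\rho - \partial_\rho A^\nu) + \partial_\rho A^\nu = F^\nu{}_\rho + \partial_\rho A^\nu$. Then

$$ \theta^{\mu\nu} = \epsilon_0 c^2 \left[ F^{\mu\rho} F^\nu{}_\rho - \frac{1}{4}\, \eta^{\mu\nu} F_{\rho\sigma}F^{\rho\sigma} \right] + \epsilon_0 c^2\, F^{\mu\rho}\, \partial_\rho A^\nu . \tag{8.155} $$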


The first term in this expression is just the energy-momentum tensor $T^{\mu\nu}$ that
was already written in eq. (8.34); as we showed there, this tensor
contains, in a covariant form, the energy density of the electromagnetic field,
given by $T^{00}$, the Poynting vector, given by $S^i = cT^{0i}$, and the Maxwell
stress tensor $T^{ij}$, and satisfies the conservation equation (8.39), which
summarizes, in a covariant form, energy conservation and momentum
conservation. In particular, in the absence of sources, which is the case
that we are considering here, on the solutions of the equations of motion
it satisfies $\partial_\nu T^{\mu\nu} = 0$ (or, since it is symmetric in $\mu, \nu$, $\partial_\nu T^{\nu\mu} = 0$ and,
after renaming the indices $\mu \leftrightarrow \nu$, $\partial_\mu T^{\mu\nu} = 0$).

The extra term in eq. (8.155) is, at first sight, quite puzzling, since
it is not even gauge-invariant (also note that it is not symmetric in
$\mu, \nu$). However, using the equation of motion $\partial_\rho F^{\mu\rho} = 0$ (appropriate
to the fact that we have derived $\theta^{\mu\nu}$ from Noether’s theorem using the
Lagrangian of the free electromagnetic field) we see that, under a gauge
transformation (8.14), it induces a change in $\theta^{\mu\nu}$ given by

$$ \begin{aligned}
\theta^{\mu\nu} &\to \theta^{\mu\nu} - \epsilon_0 c^2\, F^{\mu\rho}\, \partial_\rho \partial^\nu \theta \\
&= \theta^{\mu\nu} - \epsilon_0 c^2\, \partial_\rho \left( F^{\mu\rho}\, \partial^\nu \theta \right) ,
\end{aligned} \tag{8.156} $$

so the conserved charge [i.e., the four-momentum $P^\nu$ given by eq. (8.152)]
changes as

$$ \begin{aligned}
P^\nu &\to P^\nu - \epsilon_0 c \int d^3x\; \partial_\rho \left( F^{0\rho}\, \partial^\nu \theta \right) \\
&= P^\nu - \epsilon_0 c \int d^3x\; \partial_i \left( F^{0i}\, \partial^\nu \theta \right) .
\end{aligned} \tag{8.157} $$

The additional term is a total spatial derivative which integrates to zero
(assuming, as always, that the field decreases sufficiently fast at infinity
or, in a finite volume, that it vanishes at the boundary of the integration
volume), so the extra term in eq. (8.155) does not contribute to the
four-momentum (which then, in particular, is gauge invariant), and $\theta^{\mu\nu}$ gives
the same charges as the explicitly gauge-invariant tensor $T^{\mu\nu}$ so, from
this point of view, they are physically equivalent.
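The identification of $T^{00}$ with the energy density and of $cT^{0i}$ with the Poynting vector can be verified numerically from the first term of eq. (8.155). The sketch below again assumes $F^{0i} = E^i/c$, $F^{ij} = \epsilon^{ijk}B^k$ and signature $(-,+,+,+)$ (our reading of the chapter's conventions), and uses arbitrary sample fields; the function names are illustrative.

```python
import numpy as np

C = 299_792_458.0          # speed of light (m/s)
EPS0 = 8.8541878128e-12    # vacuum permittivity (F/m)
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # signature (-,+,+,+)


def field_tensor_upper(E, B):
    """F^{mu nu}, ASSUMING F^{0i} = E^i/c and F^{ij} = eps^{ijk} B^k."""
    F = np.zeros((4, 4))
    F[0, 1:] = np.asarray(E, dtype=float) / C
    F[1, 2], F[2, 3], F[3, 1] = B[2], B[0], B[1]
    return F - F.T  # antisymmetrize


def energy_momentum_tensor(E, B):
    """T^{mu nu} = eps0 c^2 (F^{mu rho} F^nu_rho - eta^{mu nu} F.F / 4)."""
    F = field_tensor_upper(E, B)
    invariant = np.sum((ETA @ F @ ETA) * F)    # F_{rho sigma} F^{rho sigma}
    # F @ ETA @ F.T contracts the second index of both factors with the metric
    return EPS0 * C**2 * (F @ ETA @ F.T - 0.25 * ETA * invariant)


E = np.array([300.0, -100.0, 200.0])      # V/m, arbitrary sample values
B = np.array([1.0e-6, 2.0e-6, -5.0e-7])   # T, arbitrary sample values
T = energy_momentum_tensor(E, B)

energy_density = 0.5 * EPS0 * (E @ E + C**2 * (B @ B))
poynting_over_c = EPS0 * C * np.cross(E, B)   # S/c with S = eps0 c^2 E x B

print(T[0, 0], energy_density)
print(T[0, 1:], poynting_over_c)
```

The gauge-dependent extra term of eq. (8.155) is deliberately left out here, since it cannot be built from $\mathbf{E}$ and $\mathbf{B}$ alone.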

In general, Noether’s theorem provides an explicit expression for the
conserved current, but this need not be unique. For instance, once we
have found an energy-momentum tensor $\theta^{\mu\nu}$ such that $\partial_\mu \theta^{\mu\nu} = 0$, we
can consider the “improved” energy-momentum tensor

$$ T^{\mu\nu} = \theta^{\mu\nu} + \partial_\rho A^{\rho\mu\nu} , \tag{8.158} $$

where $A^{\rho\mu\nu}$ is an arbitrary tensor antisymmetric in the indices $\rho, \mu$.
This new tensor is still conserved: $\partial_\mu \partial_\rho A^{\rho\mu\nu} = 0$ because of the
antisymmetry in $\rho, \mu$. Furthermore, for $\mu = 0$, $\partial_\rho A^{\rho 0\nu} = \partial_i A^{i0\nu}$ is a
spatial divergence, and therefore this term does not contribute to the
four-momentum (8.152) if the fields vanish sufficiently fast at infinity.
This is precisely what happened here, with $A^{\rho\mu\nu} = \epsilon_0 c^2 F^{\rho\mu} A^\nu$.$^{31}$ We
then need some physical input to choose the “correct” form, if we want
to define a local energy density (also see the discussion in Note 11 on
page 47); in this case, the requirement of gauge invariance selects $T^{\mu\nu}$.
In general, $A^{\rho\mu\nu}$ can be chosen so that $T^{\mu\nu}$ is symmetric in cases when
$\theta^{\mu\nu}$ is not.$^{32}$


31 Indeed, using the equation of motion
$\partial_\rho F^{\mu\rho} = 0$ appropriate to the fact that
we are working in vacuum, the term
$F^{\mu\rho} \partial_\rho A^\nu$ in eq. (8.155) can be rewritten
as $\partial_\rho (F^{\mu\rho} A^\nu)$.


32 A note for the advanced reader.
In General Relativity, the energy-momentum
tensor is defined in terms
of a functional derivative of the
Lagrangian with respect to a generic metric
$g_{\mu\nu}$. This automatically gives a
symmetric energy-momentum tensor.
In the case of electromagnetism in
curved space, after taking the derivative
with respect to $g_{\mu\nu}$ and specializing
$g_{\mu\nu}$ to the flat-space Minkowski
metric $\eta_{\mu\nu}$, this procedure gives
precisely the tensor $T^{\mu\nu}$ given in eq. (8.34).
Eventually, this can be taken as the
best way of uniquely resolving the
ambiguity in the definition of the
energy-momentum tensor.
33 Once again, the overall normalization
of the charge and the current cannot be 
fixed from the Noether theorem since, if 
a quantity is conserved, any multiple of 
it is still conserved, and we fix it so that 
the result will eventually agree with the 
one found in eq. (3.76). In particular, 
we do not insert in eq. (8.159) a fac- 
tor 1/c as in eq. (8.146). The powers 
of c are actually fixed from dimensional 
analysis, in this case by the requirement 
of obtaining a quantity with the dimen- 
sions of angular momentum. 


34 Incidentally, this shows that $A_0$ is
not a real dynamical variable, since its 
time derivative does not appear in the 
Lagrangian. This has important impli- 
cations for the Hamiltonian formalism 
and for the quantization of the theory, 
see Maggiore (2005), Section 4.3.2. 


35 A note for the advanced reader. The
first term in eq. (8.163) depends only
on the transformation of the coordinates,
through the term $A^k_{ij}(x)$. A similar
term will appear for any field theory,
independently of the transformation
properties of the field. Upon quantization,
it corresponds to the orbital
angular momentum of the field, as can
already be realized from the appearance
of the operator $(x_i\partial_j - x_j\partial_i)$ which, in
quantum mechanics, is related to the
angular momentum operator,

$$ L_i = -i\hbar\, \epsilon_{ijk}\, x_j \partial_k = -(i/2)\hbar\, \epsilon_{ijk} \left( x_j \partial_k - x_k \partial_j \right) . $$

The term $(E_i A_j - E_j A_i)$, instead,
comes from the term proportional to
$F_{l,ij}(A)$ in eq. (8.162), and is therefore
specific to the vector nature of the
field $\mathbf{A}$. Upon quantization, it
corresponds to the spin part of the total
angular momentum. See Maggiore
(2005), eqs. (4.81)-(4.90).


Angular momentum of the electromagnetic field from 
Noether’s theorem 


We can repeat the same procedure for the invariance under rotations,
using eqs. (8.138)-(8.144). We write the “charge” associated with
rotations in the $(i, j)$ plane as $J_{ij}$, so$^{33}$

$$ J_{ij} \equiv \int d^3x\; j^0_{ij} . \tag{8.159} $$

The corresponding angular momentum is then given by

$$ J^i = \frac{1}{2}\, \epsilon^{ijk} J^{jk} . \tag{8.160} $$

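We now specialize to the electromagnetic field, adding a subscript “em”
to the various quantities. Then eq. (8.148) gives

$$ (j^\mu_{\rm em})_{ij} = \frac{\delta\mathcal{L}}{\delta(\partial_\mu A_\nu)} \left[ A^\rho_{ij}(x)\, \partial_\rho A_\nu - F_{\nu,ij}(A) \right] - \mathcal{L}\, A^\mu_{ij}(x) . \tag{8.161} $$

From eq. (8.124), $\delta\mathcal{L}/\delta(\partial_0 A_\nu) = -\epsilon_0 c F^{0\nu}$. It therefore vanishes for
$\nu = 0$, and the only contribution comes when $\nu$ is a spatial index, that
we denote by $l$.$^{34}$ Using eq. (8.17), $\delta\mathcal{L}/\delta(\partial_0 A_l) = -\epsilon_0 E^l$. Furthermore,
from eq. (8.140), $A^0_{ij}(x) = 0$. Then,

$$ (j^0_{\rm em})_{ij} = -\epsilon_0 E_l \left[ A^k_{ij}(x)\, \partial_k A_l - F_{l,ij}(A) \right] . \tag{8.162} $$

Inserting the explicit expressions of $A^k_{ij}$ and $F_{l,ij}$ from eqs. (8.140) and
(8.144), we get$^{35}$

$$ (J_{\rm em})_{ij} = \epsilon_0 \int d^3x \left[ E_l\, (x_i \partial_j - x_j \partial_i) A_l + (E_i A_j - E_j A_i) \right] . \tag{8.163} $$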
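We can further manipulate the second term in this expression writing

$$ E_i A_j - E_j A_i = E_l\, (\partial_l x_i)\, A_j - E_l\, (\partial_l x_j)\, A_i , \tag{8.164} $$

since $\partial_l x_i = \delta_{il}$. We next integrate $\partial_l$ by parts and use the fact that, in
vacuum, $\partial_l E_l = \nabla\cdot\mathbf{E} = 0$. Then, neglecting the boundary terms,

$$ \int d^3x\, (E_i A_j - E_j A_i) = \int d^3x\; E_l \left( -x_i \partial_l A_j + x_j \partial_l A_i \right) . \tag{8.165} $$

Inserting this into eq. (8.163) and using eq. (8.160), we get

$$ \begin{aligned}
J^i_{\rm em} &= \epsilon_0 \int d^3x\; \epsilon^{ijk}\, x_j\, E_l \left( \partial_k A_l - \partial_l A_k \right) \\
&= \epsilon_0 \int d^3x\; \epsilon^{ijk}\, x_j\, \epsilon_{klm}\, E_l B_m \\
&= \epsilon_0 \int d^3x\; \epsilon^{ijk}\, x_j\, (\mathbf{E}\times\mathbf{B})_k \\
&= \epsilon_0 \int d^3x \left[ \mathbf{x}\times(\mathbf{E}\times\mathbf{B}) \right]_i .
\end{aligned} \tag{8.166} $$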


This agrees with eq. (3.76). We have therefore recovered, using Noether’s
theorem, the expression for the angular momentum of the electromagnetic
field that we found in eq. (3.76) by extracting a conservation law
directly from Maxwell’s equations. The derivation from Noether’s theorem
emphasizes that angular momentum is the conserved quantity
associated with the invariance under spatial rotations.
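The index manipulations leading to eq. (8.166) rely on two identities: $\partial_k A_l - \partial_l A_k = \epsilon_{klm} B_m$ with $B_m = \epsilon_{mkl}\partial_k A_l$, and $\epsilon_{ijk}\, x_j\, \epsilon_{klm} E_l B_m = [\mathbf{x}\times(\mathbf{E}\times\mathbf{B})]_i$. The following sketch checks both pointwise with random numbers; the matrix `grad_A`, standing for $\partial_k A_l$ at a single point, is a hypothetical stand-in rather than a field from the text.

```python
import numpy as np

# Levi-Civita symbol eps_{ijk}
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # even permutations
    eps[i, k, j] = -1.0   # odd permutations

rng = np.random.default_rng(seed=1)
x = rng.normal(size=3)
E = rng.normal(size=3)

# grad_A[k, l] stands for d_k A_l at one point (arbitrary test values)
grad_A = rng.normal(size=(3, 3))
B = np.einsum("mkl,kl->m", eps, grad_A)          # B_m = eps_{mkl} d_k A_l

# Identity 1: d_k A_l - d_l A_k = eps_{klm} B_m
curl_lhs = grad_A - grad_A.T
curl_rhs = np.einsum("klm,m->kl", eps, B)

# Identity 2: eps_{ijk} x_j E_l eps_{klm} B_m = [x x (E x B)]_i
lhs = np.einsum("ijk,j,l,klm,m->i", eps, x, E, eps, B)
rhs = np.cross(x, np.cross(E, B))

print(lhs, rhs)
```

These checks cover only the algebra of the integrand; the integration by parts in eq. (8.165) and the use of $\nabla\cdot\mathbf{E} = 0$ are not tested here.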
Problem 8.1. Lorentz invariance of the charge associated with a
current distribution


In eq. (8.11) we found that the electric charge associated with a generic
current distribution can be written as

$$ Q = \frac{1}{c} \int d^3x\; j^0(t, \mathbf{x}) , \tag{8.167} $$

and we anticipated that this expression is Lorentz invariant. This is not
evident a priori, since the integral is only over the spatial variables, and the
integrand is the $\mu = 0$ component of a four-vector. However, the Lorentz
invariance of this expression can be proven as follows.$^{36}$ Equation (8.167) is
written with respect to a specific reference frame, let’s call it $K$, which uses
coordinates $(t, \mathbf{x})$. We then introduce a four-vector $n^\mu$ that, in this reference
frame, is given by $n^\mu = (-1,0,0,0)$, or, equivalently, $n_\mu = (1,0,0,0)$. We
next observe that eq. (8.167) can be rewritten as


Qat f atate) 8, Bia), (8.168) 


where @ is the Heaviside theta function (1.67). In fact, in the K reference 
frame, 0(nvx”) = (x°), so 


8, [O(rv2”)] = 0,0(a") 
= 626(2°), (8.169) 


where 5(x°) is the Dirac delta, and we used eq. (1.70). Inserted into eq. (8.168), 
eq. (8.169) gives back eq. (8.167). However, the form (8.168) is more con- 
venient to study the behavior of Q under Lorentz transformations. Using 
ð j” = 0, we can rewrite this as 


Qa tf dz, G E l). (8.170) 


Note that, even if the current j” (x) = j” (t,x) is localized in space, it is not 
localized in time (i.e., we are not assuming that the four-current vanishes at 
t — too), so the integral in eq. (8.170) can be reduced to a boundary term, 
but this boundary term would be non-vanishing. We will then still keep it, 
for the moment, in the form of a four-dimensional integral. 

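We now perform a transformation to a new reference frame K'. Then,
$x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu$, see eq. (7.43), and similarly $n_\mu \to n'_\mu = \Lambda_\mu{}^\nu n_\nu$. Denoting
the value of the transformed charge by Q', we therefore have

$$Q' = \frac{1}{c}\int d^4x'\, \frac{\partial}{\partial x'^\mu} \left[ j^\mu(x')\, \theta(n'_\nu x'^\nu) \right] . \tag{8.171}$$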
36 We follow the derivation given in Section 6 of Weinberg (1972).

37 This part is more mathematical and can be skipped at first reading. Proofs
and extended discussions can be found in most textbooks dealing with
Riemannian geometry.


However, in eq. (8.171), $x'^\mu$ is a dummy integration variable, which we can
simply rename $x^\mu$ (note that here we also make use of the fact that we are
integrating over all of space-time, so the integration domain is unchanged
under the transformation $x^\mu \to \Lambda^\mu{}_\nu x^\nu$). Therefore, in the frame K', the
value of the charge is

$$Q' = \frac{1}{c}\int d^4x\, \partial_\mu \left[ j^\mu(x)\, \theta(n'_\nu x^\nu) \right] , \tag{8.172}$$

and

$$Q' - Q = \frac{1}{c}\int d^4x\, \partial_\mu \left\{ j^\mu(x) \left[ \theta(n'_\nu x^\nu) - \theta(n_\nu x^\nu) \right] \right\} . \tag{8.173}$$

The crucial point is that the difference of theta functions in the square brackets
vanishes at $t \to \pm\infty$ at fixed $\mathbf{x}$; actually, for fixed $|\mathbf{x}|$, it is even a function
with compact support in t, since it becomes exactly zero when $x^0$ is sufficiently
large, so that both $n'_\nu x^\nu$ and $n_\nu x^\nu$ are positive, or when $x^0$ is negative and
sufficiently large in absolute value, so that $n'_\nu x^\nu$ and $n_\nu x^\nu$ are both negative.
Since $j^\mu(x)$ vanishes at $|\mathbf{x}| \to \infty$ at fixed t (because we assume that the
current is localized in space, or at least decreases sufficiently fast as $|\mathbf{x}| \to \infty$),
the integrand in eq. (8.173) now vanishes on the whole boundary of the
four-dimensional integration region. Therefore, Gauss's theorem now ensures that
$Q' - Q = 0$, so Q is Lorentz invariant.

Having established the Lorentz invariance of Q, we can search for an ex- 
plicitly Lorentz-invariant expression that reduces to eq. (8.11) in the K frame 
which uses the variables (t,x). This will then give an expression for Q valid 
in any frame. To this purpose, we define a covariant four-vector $d^3S_\mu$ from
the condition that, in the frame K where eq. (8.167) holds,

$$d^3S_\mu = d^3x\,(1,0,0,0) . \tag{8.174}$$

In a generic frame, $d^3S_\mu$ is then obtained transforming it as a covariant
four-vector. Then, eq. (8.167) can be written in an explicitly Lorentz-invariant
form as

$$Q = \frac{1}{c}\int d^3S_\mu\, j^\mu . \tag{8.175}$$

An explicitly covariant expression for $d^3S_\mu$ can be obtained as follows.37


First, as a simpler example, consider the integration along a one-dimensional
curve embedded in four-dimensional Minkowski space. The curve can be
parametrized in terms of a single variable $\xi$, so that the position in space-time
of a point along the curve is given by assigning a function $x^\mu(\xi)$. For instance,
if $x^\mu(\xi)$ describes the trajectory of a massive particle, a natural choice for $\xi$
is the proper time $\tau$, as we discussed in Section 7.4.1. Then, for instance,
the line integral of a four-vector field $V^\mu(x)$, along the curve C defined by the
function $x^\mu(\xi)$, can be written as

$$\int_C dx^\mu\, V_\mu(x) = \int_C d\xi\, \frac{\partial x^\mu}{\partial \xi}\, V_\mu[x(\xi)] . \tag{8.176}$$

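This can be generalized to two-dimensional surfaces, or three-dimensional
volumes, embedded in four-dimensional Minkowski space. A two-dimensional
surface is parametrized by two parameters $(\xi_1,\xi_2)$, so that the position in
four-dimensional space of the point of the surface identified by $(\xi_1,\xi_2)$ is given
assigning the functions $x^\mu(\xi_1,\xi_2)$. The two-dimensional surface element $d^2S_{\mu\nu}$
can then be written as

$$d^2S_{\mu\nu} = \frac{1}{2!}\,\epsilon_{\mu\nu\rho\sigma}\, \frac{\partial(x^\rho,x^\sigma)}{\partial(\xi_1,\xi_2)}\, d\xi_1 d\xi_2 , \tag{8.177}$$

where $\partial(x^\rho,x^\sigma)/\partial(\xi_1,\xi_2)$ is the Jacobian of the transformation,

$$\frac{\partial(x^\rho,x^\sigma)}{\partial(\xi_1,\xi_2)} = \frac{\partial x^\rho}{\partial\xi_1}\frac{\partial x^\sigma}{\partial\xi_2} - \frac{\partial x^\rho}{\partial\xi_2}\frac{\partial x^\sigma}{\partial\xi_1} . \tag{8.178}$$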


Similarly, a point in a three-dimensional volume embedded in four-dimensional
Minkowski space can be parametrized by three parameters $\xi = (\xi_1,\xi_2,\xi_3)$, so
that the position in four-dimensional space of the point of the volume identified
by $(\xi_1,\xi_2,\xi_3)$ is given by the functions $x^\mu(\xi_1,\xi_2,\xi_3)$. The volume element can
then be written, in an explicitly covariant form, as


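$$d^3S_\mu = -\frac{1}{3!}\,\epsilon_{\mu\nu\rho\sigma}\, \frac{\partial(x^\nu,x^\rho,x^\sigma)}{\partial(\xi_1,\xi_2,\xi_3)}\, d\xi_1 d\xi_2 d\xi_3 , \tag{8.179}$$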


where, again, $\partial(x^\nu,x^\rho,x^\sigma)/\partial(\xi_1,\xi_2,\xi_3)$ is the Jacobian of the transformation.
In the frame where the volume is parametrized by the choices $x^1(\xi) = \xi_1$,
$x^2(\xi) = \xi_2$ and $x^3(\xi) = \xi_3$, we get $d^3S_\mu = d\xi_1 d\xi_2 d\xi_3\,(1,0,0,0)$, i.e., we get
back eq. (8.174), with $\xi_i$ identified with $x^i$.


Problem 8.2. Covariance of $\int d^3x\, T^{0\nu}$


We can proceed similarly to show that, given an energy-momentum tensor $T^{\mu\nu}$
that satisfies $\partial_\mu T^{\mu\nu} = 0$, the quantity defined by

$$P^\nu = \frac{1}{c}\int d^3x\, T^{0\nu} \tag{8.180}$$

is indeed a four-vector, as the notation $P^\nu$ suggests. To this purpose,
proceeding as in eqs. (8.168)–(8.170), we rewrite it as


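$$P^\nu = \frac{1}{c}\int d^4x\, \partial_\mu \left[ T^{\mu\nu}(x)\, \theta(nx) \right] , \tag{8.181}$$

where $nx = n_\rho x^\rho$. Then, under a Lorentz transformation, $P^\nu \to P'^\nu$, where

$$P'^\nu = \frac{1}{c}\,\Lambda^\nu{}_\rho \int d^4x\, \partial_\mu \left[ T^{\mu\rho}(x)\, \theta(n'x) \right] , \tag{8.182}$$

and we used the fact that $\partial_\mu [T^{\mu\nu}(x)\,\theta(nx)]$ has a single free index $\nu$ and
therefore transforms with a single matrix $\Lambda^\nu{}_\rho$. Then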


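$$P'^\nu - P^\nu = \frac{1}{c}\int d^4x\, \partial_\mu \left[ \Lambda^\nu{}_\rho\, T^{\mu\rho}(x)\, \theta(n'x) - \delta^\nu_\rho\, T^{\mu\rho}(x)\, \theta(nx) \right] . \tag{8.183}$$

We now consider an infinitesimal Lorentz transformation, $\Lambda^\nu{}_\rho = \delta^\nu_\rho + \omega^\nu{}_\rho$, so

$$\begin{aligned}
P'^\nu - P^\nu = {} & \frac{1}{c}\,\delta^\nu_\rho \int d^4x\, \partial_\mu \left\{ T^{\mu\rho}(x) \left[ \theta(n'x) - \theta(nx) \right] \right\} \\
& + \frac{1}{c}\,\omega^\nu{}_\rho \int d^4x\, \partial_\mu \left[ T^{\mu\rho}(x)\, \theta(n'x) \right] .
\end{aligned} \tag{8.184}$$

The term on the right-hand side in the first line vanishes by the same argument
that we used for the electric charge: $T^{\mu\nu}$ is localized in space, while the
difference of theta functions vanishes at large $|x^0|$ and fixed $|\mathbf{x}|$, so Gauss's
theorem implies that this integral is a vanishing boundary term. In the second
line, to first order in $\omega$ we can replace $n'$ by $n$, since we have already a factor
$\omega^\nu{}_\rho$ in front. Then,

$$P'^\nu - P^\nu = \frac{1}{c}\,\omega^\nu{}_\rho \int d^4x\, \partial_\mu \left[ T^{\mu\rho}(x)\, \theta(nx) \right] = \omega^\nu{}_\rho\, P^\rho . \tag{8.185}$$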


This is precisely the transformation of a four-vector under an infinitesimal Lorentz
transformation. Constructing a finite Lorentz transformation as a sequence of
infinitesimal transformations, this implies that $P^\nu$ transforms as a four-vector
also under finite Lorentz transformations. Therefore $P^\nu$, defined by eq. (8.180),
is indeed a four-vector.

Similarly to eq. (8.175), we can then write it in an explicitly
Lorentz-covariant form as

$$P^\nu = \frac{1}{c}\int d^3S_\mu\, T^{\mu\nu} , \tag{8.186}$$

where, as in eq. (8.174), $d^3S_\mu$ is defined from the condition that, in the frame
K where eq. (8.48) holds, $d^3S_\mu = d^3x\,(1,0,0,0)$, and, in a generic frame, it is
obtained requiring that it transforms as a four-vector. Its covariant expression
is given in eq. (8.179).


Problem 8.3. Relativistic motion in a constant electric field 


We compute here the relativistic motion of a charged particle in a constant
electric field. Setting $\mathbf{E} = E\hat{\mathbf{x}}$, the equation of motion (3.6), which, as we have
seen in this chapter, is fully relativistic when supplemented by $\mathbf{p} = \gamma(v)m\mathbf{v}$,
becomes

$$\frac{d\mathbf{p}}{dt} = qE\,\hat{\mathbf{x}} . \tag{8.187}$$

Setting the initial condition $\mathbf{p}(t=0) = 0$, we then have

$$p_x(t) = qEt , \tag{8.188}$$

while $p_y(t) = p_z(t) = 0$. Writing $p_i = \gamma m v_i$, we therefore have $v_y(t) = v_z(t) = 0$
and, denoting $v_x(t)$ simply as $v(t)$, eq. (8.188) becomes


$$\frac{m\,v(t)}{\sqrt{1 - v^2(t)/c^2}} = qEt . \tag{8.189}$$

This can be solved for v(t), obtaining

$$\frac{v(t)}{c} = \frac{qEt}{\sqrt{m^2c^2 + (qEt)^2}} . \tag{8.190}$$

In the limit $qEt \ll mc$ this reduces to

$$v(t) \simeq \frac{qEt}{m} , \tag{8.191}$$


which is the non-relativistic result for a particle subject to the constant force
$F = qE$. However, the right-hand side of eq. (8.190) is always smaller than
one, so v(t) is always smaller than c and, as $t \to \infty$, $v(t) \to c$. Writing
$v(t) = dx/dt$, and integrating eq. (8.190) with the initial condition $x(0) = 0$,
gives

$$x(t) = \frac{mc^2}{qE} \left[ \sqrt{1 + \left( \frac{qEt}{mc} \right)^2} - 1 \right] , \tag{8.192}$$

which interpolates between the non-relativistic behavior

$$x(t) \simeq \frac{1}{2}\,a t^2 \tag{8.193}$$

(with $a = qE/m$) at small t, and the ultra-relativistic behavior $x(t) \simeq ct$ at
large t.

As we will see in Chapter 10, an accelerated particle actually radiates elec- 
tromagnetic waves, and therefore loses energy. The above computation is 
therefore valid only as long as the associated energy losses can be neglected. 
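
As a numerical illustration of eqs. (8.190) and (8.192), the following Python sketch evaluates v(t) and x(t) and compares them with the two limits discussed above. The function names, the choice of an electron, and the field value of 10^6 V/m are assumptions made only for this example; radiation losses are neglected, as in the derivation.

```python
import math

C = 299_792_458.0            # speed of light in m/s (exact SI value)
Q_E = 1.602176634e-19        # elementary charge in C (illustrative choice of particle)
M_E = 9.1093837015e-31       # electron mass in kg (CODATA 2018 value)


def velocity(t, q, E, m):
    """Speed v(t) from eq. (8.190), for a particle starting at rest at t = 0."""
    p = q * E * t                                  # eq. (8.188): p_x(t) = qEt
    return C * p / math.sqrt((m * C) ** 2 + p ** 2)


def position(t, q, E, m):
    """Position x(t) from eq. (8.192), with x(0) = 0."""
    s = q * E * t / (m * C)                        # dimensionless ratio qEt/(mc)
    root = math.sqrt(1.0 + s * s)
    # sqrt(1 + s^2) - 1 is rewritten as s^2 / (sqrt(1 + s^2) + 1) to avoid
    # cancellation errors when s is small (non-relativistic regime).
    return (m * C ** 2 / (q * E)) * s * s / (root + 1.0)


if __name__ == "__main__":
    E_FIELD = 1.0e6                                # V/m, assumed value for the example
    a = Q_E * E_FIELD / M_E                        # non-relativistic acceleration qE/m
    for t in (1e-12, 1e-9, 1e-6, 1e-3):
        v = velocity(t, Q_E, E_FIELD, M_E)
        x = position(t, Q_E, E_FIELD, M_E)
        print(f"t = {t:.0e} s  v/c = {v / C:.6f}  "
              f"x = {x:.6e} m  a t^2/2 = {0.5 * a * t * t:.6e} m")
```

For these assumed values the crossover time mc/(qE) is of order 10^-9 s: well before it, x(t) follows the non-relativistic law (8.193), and well after it, v/c approaches one from below.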


Problem 8.4. Relativistic motion in a constant magnetic field 
We next consider the motion of a massive particle in a constant magnetic
field $\mathbf{B}$. The relativistic equation of motion (3.6), together with $\mathbf{p} = m\gamma\mathbf{v}$,
now gives

$$m\,\frac{d(\gamma\mathbf{v})}{dt} = q\,\mathbf{v}\times\mathbf{B} . \tag{8.194}$$
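Taking the scalar product with $\mathbf{v}$ and using the fact that $\mathbf{v}\cdot(\mathbf{v}\times\mathbf{B}) = 0$ we get

$$\mathbf{v}\cdot\frac{d(\gamma\mathbf{v})}{dt} = 0 . \tag{8.195}$$

This implies that

$$0 = \mathbf{v}\cdot\left( \frac{d\gamma}{dt}\,\mathbf{v} + \gamma\,\frac{d\mathbf{v}}{dt} \right) = v^2\,\frac{d\gamma}{dt} + \frac{1}{2}\,\gamma\,\frac{dv^2}{dt} . \tag{8.196}$$

Using

$$\frac{d\gamma}{dt} = \frac{d}{dt}\left( 1 - \frac{v^2}{c^2} \right)^{-1/2} = \frac{\gamma^3}{2c^2}\,\frac{dv^2}{dt} , \tag{8.197}$$

together with the identity $1 + \gamma^2 v^2/c^2 = \gamma^2$, eq. (8.196) becomes

$$\frac{\gamma^3}{2}\,\frac{dv^2}{dt} = 0 \tag{8.198}$$

and therefore

$$\frac{dv^2}{dt} = 0 . \tag{8.199}$$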

The modulus of the velocity of a particle in a magnetic field therefore stays
constant, even in the full relativistic setting. Since the energy $\mathcal{E}$ is just a
function of $v^2$ through $\mathcal{E} = \gamma mc^2$, the energy is also constant. This result
could have also been obtained from the $\mu = 0$ component of eq. (8.62), which
is given explicitly by eq. (8.66), and becomes $d\mathcal{E}/dt = 0$ when $\mathbf{E} = 0$.

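Since $v^2$ is constant, $\gamma$ is also constant, and we can extract it from the time
derivative in eq. (8.194). Setting $\mathbf{B} = B\hat{\mathbf{z}}$, we then obtain

$$\frac{d\mathbf{v}}{dt} = \omega\,\mathbf{v}\times\hat{\mathbf{z}} , \tag{8.200}$$

where

$$\omega = \frac{qB}{\gamma m} , \tag{8.201}$$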
or, in terms of the energy $\mathcal{E} = \gamma mc^2$ of the particle,

$$\omega = \frac{qBc^2}{\mathcal{E}} . \tag{8.202}$$

In the non-relativistic limit, eq. (8.201) becomes

$$\omega \simeq \frac{qB}{m} \qquad \text{(non-relativistic)} , \tag{8.203}$$


which is also called the cyclotron frequency. In components, eq. (8.200) reads

$$\frac{dv_x}{dt} = \omega v_y , \tag{8.204}$$
$$\frac{dv_y}{dt} = -\omega v_x , \tag{8.205}$$

while $dv_z/dt = 0$. The solution is $v_z$ constant, and (choosing for definiteness
the origin of time so that $v_x = 0$ when $t = 0$)

$$v_x(t) = v_\perp \sin\omega t , \tag{8.206}$$
$$v_y(t) = v_\perp \cos\omega t , \tag{8.207}$$


where $v_\perp$ is a constant, which represents the modulus of the velocity in the
(x, y) plane, so that $v^2 = v_\perp^2 + v_z^2$ (as we have seen, both $v^2$ and $v_z^2$ are
constant). A further integration gives

$$x(t) = x_0 - \frac{v_\perp}{\omega}\cos\omega t , \tag{8.208}$$
$$y(t) = y_0 + \frac{v_\perp}{\omega}\sin\omega t , \tag{8.209}$$


so the particle moves with angular velocity $\omega$, given by eq. (8.201), on a circle
in the (x, y) plane, centered in a generic point $(x_0, y_0)$ and of radius $r = v_\perp/|\omega|$.
From eq. (8.201), we get

$$r = \frac{m\gamma v_\perp}{|q|B} . \tag{8.210}$$

For a given velocity $v_\perp$ of the particle in the (x, y) plane, the larger the value
of B, the smaller the radius of the circle to which the particle's motion is
confined. With a non-vanishing value of $v_z$, we have also $z(t) = z_0 + v_z t$, and
the motion in three-dimensional space is actually helicoidal.

Just as for the motion in an electric field, we have here neglected the fact
that an accelerated particle radiates electromagnetic waves so, again, the
previous computation is valid only as long as the associated energy losses can be
neglected. For a non-relativistic particle in circular motion we will compute
their effect in Problem 10.2, while the radiation emitted in the full relativistic
setting will be discussed in Section 10.6.3.
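
The formulas of Problem 8.4 can be checked numerically with a short Python sketch. It evaluates eqs. (8.201) and (8.206)–(8.210) for a field along +z with B > 0; the function names, the choice of a proton, and the numerical values used in the example are assumptions made for illustration, and radiation losses are again neglected.

```python
import math

C = 299_792_458.0            # speed of light in m/s
Q_P = 1.602176634e-19        # proton charge in C (illustrative choice of particle)
M_P = 1.67262192369e-27      # proton mass in kg (CODATA 2018 value)


def lorentz_gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def cyclotron_omega(q, B, m, v):
    """Angular velocity of eq. (8.201), qB/(gamma m); v is the constant total speed."""
    return q * B / (lorentz_gamma(v) * m)


def orbit_radius(q, B, m, v_perp, v_z=0.0):
    """Radius of eq. (8.210), gamma m v_perp / (|q| B)."""
    g = lorentz_gamma(math.hypot(v_perp, v_z))
    return g * m * v_perp / (abs(q) * B)


def state(t, q, B, m, v_perp, v_z=0.0, x0=0.0, y0=0.0, z0=0.0):
    """Position and velocity at time t from eqs. (8.206)-(8.209), with B along +z."""
    w = cyclotron_omega(q, B, m, math.hypot(v_perp, v_z))
    pos = (x0 - (v_perp / w) * math.cos(w * t),
           y0 + (v_perp / w) * math.sin(w * t),
           z0 + v_z * t)
    vel = (v_perp * math.sin(w * t), v_perp * math.cos(w * t), v_z)
    return pos, vel


if __name__ == "__main__":
    B_FIELD = 1.0                                  # tesla, assumed value
    V_PERP = 0.6 * C                               # gives gamma = 1.25 when v_z = 0
    print("omega =", cyclotron_omega(Q_P, B_FIELD, M_P, V_PERP), "rad/s")
    print("r     =", orbit_radius(Q_P, B_FIELD, M_P, V_PERP), "m")
```

At fixed $v_\perp$ the radius scales as 1/B, in line with the remark after eq. (8.210).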


Electromagnetic waves in 
vacuum 


In this chapter, we study Maxwell’s equations in the absence of sources. 
We will then discover that, even in the absence of sources, there are 
non-trivial solutions that describe electromagnetic waves propagating 
across empty space. We will study them in the Lorenz gauge and in the 
Coulomb gauge. The former treatment allows us to maintain Lorentz 
covariance explicitly at each stage. However, it will be more subtle to 
understand how to eliminate spurious polarizations, and remain with 
the two physical polarizations that characterize electromagnetic waves. 
Working in the Coulomb gauge, in contrast, we will deal from the begin- 
ning with only the two physical polarizations, at the price of losing ex- 
plicit Lorentz covariance in the intermediate steps. The two approaches 
are complementary, technically and conceptually, and are both important
to understanding electromagnetic waves.1 We begin, however, with
a discussion of wave equations in a simpler setting, involving only a 
Lorentz scalar field, rather than the full electromagnetic field. 


9.1 Wave equations 


Let us begin by studying an equation of the form 


$$\left[ -\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2 \right] f = 0 , \tag{9.1}$$

for some scalar function $f(x) = f(t,\mathbf{x})$. It is interesting to compare this
equation to a Laplace equation $\nabla^2 f(\mathbf{x}) = 0$. If we set the boundary
condition that $f(\mathbf{x})$ vanishes as $|\mathbf{x}| \to \infty$, it can be proved that the only
solution of $\nabla^2 f(\mathbf{x}) = 0$ is $f = 0$. In order to have a non-vanishing
solution, a source term is needed, as in eq. (3.93), and in this case we
saw that (for a localized source) the solution is given by eq. (4.16). In
contrast, because of the opposite sign of the time and space derivatives,
eq. (9.1) has non-trivial solutions even in the absence of sources.

In a space-time with just one spatial dimension, the most general wave 
solution can be found quite easily: consider the equation 


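$$\left[ -\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x^2} \right] f(t,x) = 0 , \tag{9.2}$$

for a function of two variables t and x. Defining $x_\pm = x \pm ct$ and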




1 The interplay between working with the physical degrees of freedom at the
price of losing explicit Lorentz covariance, or using a Lorentz-covariant
formalism at the price of having to deal with spurious degrees of freedom, is
also a recurrent theme in the quantization of electrodynamics.
2 Explicitly,

$$\partial_\mu e^{ikx} = \partial_\mu e^{ik_\nu x^\nu} = ik_\mu e^{ikx} ,$$

and

$$\Box e^{ikx} = \partial^\mu\partial_\mu e^{ikx} = \partial^\mu (ik_\mu e^{ikx}) = -k^\mu k_\mu e^{ikx} = -k^2 e^{ikx} .$$


3 Observe that the Minkowskian signature of the wave equation was essential
to have a non-trivial solution. The same procedure applied to the equation
$\nabla^2 f(\mathbf{x}) = 0$ would give $\mathbf{k}^2 = 0$ which, for $\mathbf{k}^2 = k_x^2 + k_y^2 + k_z^2$, has only the
solution $\mathbf{k} = 0$. The Fourier mode with $\mathbf{k} = 0$ corresponds to a solution
constant in space, which is eliminated (in the sense that its coefficient $f_k$ is
set to zero) by the boundary condition that $f(\mathbf{x})$ vanishes at $|\mathbf{x}| \to \infty$.


$\partial_\pm = \partial/\partial x_\pm$, this equation can be rewritten as

$$\partial_+ \partial_- f(x_+, x_-) = 0 . \tag{9.3}$$

The most general solution is

$$f(x_+, x_-) = f_1(x_-) + f_2(x_+) , \tag{9.4}$$

for arbitrary functions $f_1$ and $f_2$. A function $f_1(x_-) = f_1(x - ct)$
describes a wave moving rightward with speed c: if this function at $t = 0$
has a given value in $x = 0$, at a subsequent time $t = t_0$ it will have
the same value in $x = ct_0$, so the whole function is simply translated,
and advances with speed c toward the positive direction of the x axis.
Similarly, $f_2(x_+)$ describes a left-moving wave, which again travels with
speed c.
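
The statement that any $f_1(x - ct) + f_2(x + ct)$ solves eq. (9.2) can be checked numerically. The sketch below is only an illustration: the two profiles, the value of c and the finite-difference step are arbitrary assumptions, and the derivative operator is approximated by central differences, so the residual is small rather than exactly zero.

```python
import math


def f1(u):
    """Arbitrary smooth profile for the right-moving part (an assumed example)."""
    return math.exp(-u * u)


def f2(u):
    """Arbitrary smooth profile for the left-moving part (an assumed example)."""
    return math.sin(u) / (1.0 + u * u)


def f(t, x, c=1.0):
    """General solution (9.4): f1(x - ct) + f2(x + ct)."""
    return f1(x - c * t) + f2(x + c * t)


def wave_residual(func, t, x, c=1.0, h=1e-3):
    """Central-difference estimate of [-(1/c^2) d^2/dt^2 + d^2/dx^2] func at (t, x)."""
    f_tt = (func(t + h, x) - 2.0 * func(t, x) + func(t - h, x)) / h ** 2
    f_xx = (func(t, x + h) - 2.0 * func(t, x) + func(t, x - h)) / h ** 2
    return -f_tt / c ** 2 + f_xx


if __name__ == "__main__":
    c = 2.0
    solution = lambda t, x: f(t, x, c)
    # A function that is not of the form (9.4), for comparison.
    not_a_solution = lambda t, x: math.exp(-(x * x + t * t))
    print("residual, superposition of travelling waves:", wave_residual(solution, 0.3, 0.7, c))
    print("residual, generic function:                 ", wave_residual(not_a_solution, 0.3, 0.7, c))
```

The first residual is at the level of the discretization error, while the second is of order one, which reflects the fact that only the combinations $x \pm ct$ are allowed in the arguments.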

If we have a function of all three spatial variables plus time, $f(t,\mathbf{x})$,
that obeys $\Box f = 0$, there are therefore particular solutions of the form
$f(t,x,y,z) = f_1(x \pm ct)$, independent of the (y, z) coordinates, that
describe a plane wavefront that moves in the x direction (leftward or
rightward) at the speed of light. To study the most general wave-like
solution, in more than one spatial dimension, it is convenient to use
Fourier analysis, that we already introduced in Section 1.5. In particular,
any function $f(\mathbf{x})$ (subject to conditions such as belonging to $L^1(\mathbb{R}^3)$,
the space of functions whose absolute value is integrable over $\mathbb{R}^3$) can be
decomposed in a superposition of modes $e^{i\mathbf{k}\cdot\mathbf{x}}$ with coefficients $\tilde{f}(\mathbf{k})$, as
in eq. (1.100), and similarly any function of time [again, belonging e.g.
to $L^1(\mathbb{R})$] can be decomposed as in eq. (1.104). For a function of the
four-dimensional coordinates $x^\mu = (x^0,\mathbf{x})$, the Fourier decomposition
can be written as in eq. (1.105) that, in a four-vector notation, reads


$$f(x) = \int \frac{d^4k}{(2\pi)^4}\, \tilde{f}(k)\, e^{ikx} , \tag{9.5}$$

where $k^\mu = (k^0,\mathbf{k})$ is a four-vector and $kx = k_\mu x^\mu$. If $f(x)$ is real, then
$\tilde{f}^*(k) = \tilde{f}(-k)$. For a scalar function $f(x)$, let us search a solution of
$\Box f = 0$ by making the ansatz

$$f(x) = f_k\, e^{ikx} , \tag{9.6}$$

where $f_k$ is the amplitude (which cannot be determined by the equation,
since it is just an overall constant). We then insert eq. (9.6) into $\Box f = 0$
to see if, and under what conditions on k, this is indeed a solution. We
observe that2


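$$\partial_\mu e^{ikx} = ik_\mu e^{ikx} , \tag{9.7}$$

and

$$\Box e^{ikx} = -k^2 e^{ikx} , \tag{9.8}$$

where $k^2 = k_\mu k^\mu$. We see that $e^{ikx}$ is a solution of $\Box e^{ikx} = 0$ if, and
only if, $k^2 = 0$, i.e., $-(k^0)^2 + \mathbf{k}^2 = 0$,3 so we get

$$(k^0)^2 = \mathbf{k}^2 . \tag{9.9}$$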
Comparing with eq. (7.146) and identifying $k^0$ with $\mathcal{E}/c$ and $\mathbf{k}$ with $\mathbf{p}$
(apart from a common overall constant which is also needed for dimensional
reasons since, from eq. (9.5), $k^\mu$ has dimensions of inverse length),
we see that this is formally the same as the relation between energy and
momentum of a massless particle. To develop the connection between
fields and particles the formalism of quantum field theory is really needed
so, within our classical context, we cannot push this analogy too much.4
At the quantum field theory level, however, one indeed finds that the
quanta of a field that obeys an equation such as $\Box f = 0$ correspond to
massless particles.


9.2 Electromagnetic waves in the Lorenz 
gauge 


As we have seen, the introduction of the gauge potentials and the use 
of the covariant formalism has the advantage of explicitly unveiling the 
Lorentz symmetry of electromagnetism. It also has the technical advan- 
tage of considerably simplifying the equations. We will therefore adopt 
it for our treatment of electromagnetic waves.5 We have seen that,
before making any choice of gauge, the first pair of Maxwell's equations
is given by eq. (8.28). In this form, each of the four equations
(corresponding to $\nu = 0,1,2,3$) involves all four components of $A^\mu$.
Similarly, the original Maxwell's equations in the form (3.8)–(3.11) mix the
components of $\mathbf{E}$ and $\mathbf{B}$. However, once formulated in terms of gauge
potentials, electromagnetism is invariant under the gauge transformation
(8.14), and we can fix this freedom to impose the Lorenz gauge
$\partial_\mu A^\mu = 0$, see eq. (8.29). In this gauge, eq. (8.28) becomes simply
eq. (8.30), where the four components of $A^\mu$ are decoupled. We now set
$j^\mu = 0$, so we study this equation in the absence of sources,


$$\left[ -\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2 \right] A^\mu = 0 . \tag{9.10}$$

For the electromagnetic potential $A^\mu(x)$ the solution is slightly more
complex compared to eq. (9.6), because we must take into account that,
besides the x dependence, $A^\mu$ also carries a four-vector index. Therefore,
we rather look for elementary solutions of the form

$$A^\mu(x) = A_k\, \epsilon^\mu(\mathbf{k})\, e^{ikx} = A_k\, \epsilon^\mu(\mathbf{k})\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \tag{9.11}$$

where $A_k$ is the amplitude and, in the last equality, we defined $\omega$ from
$k^0 = \omega/c$. The four-vector $\epsilon^\mu$, called the polarization four-vector,
carries the Lorentz index and is normalized as $|\eta_{\mu\nu}\epsilon^\mu\epsilon^\nu| = 1$ (except in the
special case in which it is a null vector, i.e., $\eta_{\mu\nu}\epsilon^\mu\epsilon^\nu = 0$, which will be
analyzed separately below). The dependence on $x^\mu = (ct,\mathbf{x})$ is only
carried by the exponential. A wave with the simple temporal dependence
given in eq. (9.11) is called monochromatic, since it has just a single

4 As already discussed in Note 18 on page 201, the correct quantum-mechanical
relations are in fact $\mathcal{E}/c = \hbar k^0$ and $\mathbf{p} = \hbar\mathbf{k}$, and require the reduced
Planck constant $\hbar$ that enters in quantum mechanics.

5 The gauge potential $A^\mu$ also turns out to be the fundamental quantity for
describing the electromagnetic field at the quantum level, a subject that,
however, we cannot enter into in this course.
frequency. Its spatial dependence corresponds to a plane wave, i.e., a
wave that is constant in the direction transverse to the propagation
direction $\hat{\mathbf{k}}$. More general functions are obtained by superpositions of the
form (9.5), with arbitrary coefficients, as we will see in more detail below.
Since plane waves extend indefinitely in the transverse direction,
they are a mathematical idealization, and realistic wave solutions are
obtained from a superposition of plane waves which results in a finite
extent in the transverse direction. Note that, since the exponential
depends on k, a priori, to obtain a solution, we must admit the possibility
of a dependence of $\epsilon^\mu$ on k, too. Once again, the reality condition will be
taken care of by the superposition with the complex conjugate solution
$A_k^*\, \epsilon^{\mu *}(\mathbf{k})\, e^{-ikx}$.

Inserting the ansatz (9.11) into eq. (9.10), $\epsilon^\mu$, which is independent of
x, simply goes through the $\Box$ operator and, using eq. (9.8), we get again
the condition

$$k^2 = 0 , \tag{9.12}$$

just as in the scalar case discussed in Section 9.1. However, we are not
done yet, since eq. (9.10) was obtained imposing that $A^\mu$ satisfies the
Lorenz gauge condition $\partial_\mu A^\mu = 0$, so our ansatz must also satisfy it.
Using eq. (9.7),

$$\partial_\mu \left[ \epsilon^\mu(\mathbf{k})\, e^{ikx} \right] = ik_\mu \epsilon^\mu(\mathbf{k})\, e^{ikx} , \tag{9.13}$$

and therefore we must require

$$k_\mu \epsilon^\mu(\mathbf{k}) = \eta_{\mu\nu} k^\mu \epsilon^\nu(\mathbf{k}) = 0 . \tag{9.14}$$

We see that $\epsilon^\mu$ depends indeed on k: it must be orthogonal (with respect
to the scalar product defined by the metric $\eta_{\mu\nu}$) to $k^\mu$.

Apparently, we have found that, for a given $k^\mu$, there are three
independent solutions, corresponding to the three independent solutions
of eq. (9.14). For instance, with a rotation we can always set $\mathbf{k}$ along
the positive z axis; then, since $k^2 = 0$, we have $k_z = +k^0$ (choosing for
definiteness $k^0 > 0$; the case $k^0 < 0$ can be treated in the same way)
and

$$k^\mu = k^0 (1,0,0,1) . \tag{9.15}$$

Given this form of $k^\mu$, two solutions of eq. (9.14) are immediately found,
and are given by

$$\epsilon^\mu_{(1)}(\mathbf{k}) = (0,1,0,0) , \qquad \epsilon^\mu_{(2)}(\mathbf{k}) = (0,0,1,0) , \tag{9.16}$$

where the normalization factors have been chosen so that $\eta_{\mu\nu}\epsilon^\mu\epsilon^\nu = 1$.
These are called transverse polarizations, since the corresponding spatial
vectors $\boldsymbol{\epsilon}_{(1)} = (1,0,0)$ and $\boldsymbol{\epsilon}_{(2)} = (0,1,0)$ are orthogonal, in a
three-dimensional sense, to the vector $\hat{\mathbf{k}} = (0,0,1)$.
The third solution, recalling the expression $\eta_{\mu\nu} = {\rm diag}(-1,1,1,1)$ for the
metric, is

$$\epsilon^\mu_{(3)}(\mathbf{k}) = (1,0,0,1) . \tag{9.17}$$
Notice that $\epsilon^\mu_{(3)}(\mathbf{k}) \propto k^\mu$, and the fact that $\epsilon^\mu_{(3)}(\mathbf{k})\, k_\mu = 0$ can also be seen
as a consequence of $k^\mu k_\mu = 0$. In particular, for the spatial components,
we have $\boldsymbol{\epsilon}_{(3)} \propto \mathbf{k}$. This is called a longitudinal polarization. The
four-vector (9.17) has zero norm, $\eta_{\mu\nu}\epsilon^\mu\epsilon^\nu = 0$, so it cannot be normalized
imposing $|\eta_{\mu\nu}\epsilon^\mu\epsilon^\nu| = 1$. We then simply choose eq. (9.17) as our third
solution [any rescaling of it then corresponds to a redefinition of the
amplitude $A_k$ associated with this solution in eq. (9.11)]. However, the
longitudinal polarization is non-physical and can be eliminated with a
further gauge transformation. In fact, after having reached the Lorenz
gauge $\partial_\mu A^\mu = 0$, we can perform a further gauge transformation

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\theta , \tag{9.18}$$

and, if we choose $\theta$ such that $\Box\theta = 0$, we still have $\partial_\mu A'^\mu = 0$. In
other words, the Lorenz gauge condition does not completely fix the
freedom of performing gauge transformations, and we can still make a
further gauge transformation with $\theta$ such that $\Box\theta = 0$. Working in
Fourier space, we can therefore consider a function $\theta(x) = \alpha e^{ikx}$, with
$\alpha$ an arbitrary complex constant (once again, the reality condition is
ensured by the complex conjugate term in the superposition of modes).
Requiring $\Box\theta = 0$ gives again $k^2 = 0$. Under this gauge transformation,

$$A^\mu \to A^\mu - \partial^\mu (\alpha e^{ikx}) = A^\mu - i\alpha k^\mu e^{ikx} . \tag{9.19}$$

Therefore, a solution proportional to $k^\mu e^{ikx}$, with $k^2 = 0$, can always be
removed with a residual gauge transformation. We can therefore set it
to zero without loss of generality.6
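
The counting of polarizations can be made concrete with a small Python sketch. It checks the conditions $k^2 = 0$ and $k_\mu\epsilon^\mu = 0$ for the three vectors (9.16)–(9.17), and then evaluates the antisymmetric combination $k_\mu\epsilon_\nu - k_\nu\epsilon_\mu$, which is proportional to $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ for the plane wave (9.11) by eq. (9.7). The helper names are assumptions introduced for this example, and $k^0$ is set to 1 since only the structure matters.

```python
ETA = (-1.0, 1.0, 1.0, 1.0)   # diagonal of the metric eta_{mu nu} = diag(-1, 1, 1, 1)


def lower(v):
    """Lower the index of a four-vector: v_mu = eta_{mu nu} v^nu."""
    return tuple(e * x for e, x in zip(ETA, v))


def dot(a, b):
    """Minkowski scalar product eta_{mu nu} a^mu b^nu."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))


def field_strength_amplitude(k, eps):
    """k_mu eps_nu - k_nu eps_mu, proportional to F_{mu nu} for A_mu = eps_mu e^{ikx}."""
    k_low, e_low = lower(k), lower(eps)
    return [[k_low[m] * e_low[n] - k_low[n] * e_low[m] for n in range(4)]
            for m in range(4)]


K = (1.0, 0.0, 0.0, 1.0)                      # eq. (9.15) with k^0 = 1
EPS = {
    "transverse 1": (0.0, 1.0, 0.0, 0.0),     # eq. (9.16)
    "transverse 2": (0.0, 0.0, 1.0, 0.0),     # eq. (9.16)
    "longitudinal": (1.0, 0.0, 0.0, 1.0),     # eq. (9.17)
}

if __name__ == "__main__":
    print("k.k =", dot(K, K))
    for name, eps in EPS.items():
        F = field_strength_amplitude(K, eps)
        nonzero = any(entry != 0.0 for row in F for entry in row)
        print(f"{name}: k.eps = {dot(K, eps)}, eps.eps = {dot(eps, eps)}, "
              f"field strength non-zero: {nonzero}")
```

All three vectors satisfy the Lorenz condition, but only the two transverse ones give a non-vanishing field strength; the longitudinal one, being proportional to $k^\mu$, drops out of $F_{\mu\nu}$, consistently with the fact that it can be removed by the residual gauge transformation (9.19).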

In conclusion, in vacuum, electromagnetic waves are described by a
superposition of terms of the form (9.11) and of its complex conjugate, with
$k^2 = 0$ and $k_\mu\epsilon^\mu(\mathbf{k}) = 0$. The latter condition admits as solutions the
two transverse polarizations and a longitudinal polarization $\epsilon^\mu(\mathbf{k}) \propto k^\mu$;
however, the latter can be set to zero with a residual gauge transformation,
so the electromagnetic field has only two degrees of freedom,
corresponding to the two transverse polarizations. In a frame where
$k^\mu$ is given by eq. (9.15), a basis for these two transverse polarizations
is given by eq. (9.16). This is called the basis of linear polarizations.
Another useful basis is given by circular polarizations, defined as

$$\epsilon^\mu_{(+)}(\mathbf{k}) = \frac{1}{\sqrt{2}}\,(0,1,i,0) , \qquad \epsilon^\mu_{(-)}(\mathbf{k}) = \frac{1}{\sqrt{2}}\,(0,1,-i,0) . \tag{9.20}$$

Given that the temporal components of these four-vectors vanish, we
can more simply say that, in the frame where $\mathbf{k} = (0,0,k)$, the linear
polarizations are

$$\hat{\boldsymbol{\epsilon}}_{(1)} = (1,0,0) , \qquad \hat{\boldsymbol{\epsilon}}_{(2)} = (0,1,0) , \tag{9.21}$$

while the circular polarizations are

$$\hat{\boldsymbol{\epsilon}}_{(+)} = \frac{1}{\sqrt{2}}\,(1,i,0) , \qquad \hat{\boldsymbol{\epsilon}}_{(-)} = \frac{1}{\sqrt{2}}\,(1,-i,0) . \tag{9.22}$$

6 If one had not noticed the existence of this residual gauge transformation and
had kept the solution proportional to $\epsilon^\mu_{(3)}(\mathbf{k})$ in the computation of
gauge-invariant quantities such as the electric and magnetic field, one would
have found that the contributions proportional to $\epsilon^\mu_{(3)}$ "miraculously" cancel.
This cancelation is simply due to the fact that the $\epsilon^\mu_{(3)}(\mathbf{k})$ term can be set to
zero with a gauge transformation, and therefore cannot affect gauge-invariant
quantities.

7 We closely follow Maggiore (2005), Section 3.5.2.


The physical polarization vectors are therefore orthogonal (in a three- 
dimensional sense) to the propagation direction k. In Section 9.5 we will 
discuss linear, circular (and elliptic) polarizations further, and we will 
understand the reason for these names. 
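
As a preview of why the names are appropriate, the following Python sketch evaluates the real part of $\hat{\boldsymbol{\epsilon}}\, e^{-i\omega t}$ at a fixed point for the vectors in eqs. (9.21) and (9.22). It is an illustration under stated assumptions (unit amplitude, $\mathbf{x} = 0$, helper names chosen here), not the discussion of Section 9.5.

```python
import cmath
import math

INV_SQRT2 = 1.0 / math.sqrt(2.0)
EPS_LINEAR = {1: (1.0, 0.0, 0.0), 2: (0.0, 1.0, 0.0)}                    # eq. (9.21)
EPS_CIRCULAR = {"+": (INV_SQRT2, 1j * INV_SQRT2, 0.0),
                "-": (INV_SQRT2, -1j * INV_SQRT2, 0.0)}                  # eq. (9.22)
K_HAT = (0.0, 0.0, 1.0)                                                  # propagation direction


def hermitian_dot(a, b):
    """a . b^*, the product used to normalize complex polarization vectors."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def real_field(eps, omega_t):
    """Re[eps e^{-i omega t}] at x = 0, with unit amplitude (assumed normalization)."""
    phase = cmath.exp(-1j * omega_t)
    return tuple((complex(c) * phase).real for c in eps)


if __name__ == "__main__":
    for omega_t in (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4):
        lin = real_field(EPS_LINEAR[1], omega_t)
        circ = real_field(EPS_CIRCULAR["+"], omega_t)
        print(f"omega t = {omega_t:.3f}  linear: ({lin[0]:+.3f}, {lin[1]:+.3f})  "
              f"circular: ({circ[0]:+.3f}, {circ[1]:+.3f})")
```

For the linear vector the field oscillates along a fixed direction, while for the circular one its tip moves on a circle of constant radius in the plane transverse to $\hat{\mathbf{k}}$; the two circular vectors are orthonormal with respect to the product $\mathbf{a}\cdot\mathbf{b}^*$.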


9.3 Electromagnetic waves in the 
Coulomb gauge 
It is interesting to compare these results with an analysis in the Coulomb
gauge, $\nabla\cdot\mathbf{A} = 0$. In this case, the relevant equations are (3.93) and
(3.94) that, in vacuum, become

$$\nabla^2\phi = 0 , \tag{9.23}$$
$$\Box\mathbf{A} = \frac{1}{c^2}\frac{\partial}{\partial t}\nabla\phi . \tag{9.24}$$

With the boundary condition that $\phi$ vanishes at spatial infinity, the only
solution of eq. (9.23) is $\phi = 0$. Thus, in vacuum, we can set

$$\phi = 0 , \qquad \nabla\cdot\mathbf{A} = 0 . \tag{9.25}$$


In fact, we could have also reached this conclusion with a more complete
choice of gauge, as follows.7 First of all, starting from a generic field
configuration $A_\mu$, we can find a gauge transformation $A_\mu \to A'_\mu$ such
that $A'_0 = 0$. It is given simply by

$$A_\mu(t,\mathbf{x}) \to A'_\mu(t,\mathbf{x}) = A_\mu(t,\mathbf{x}) - \partial_\mu \int^t c\,dt'\, A_0(t',\mathbf{x}) , \tag{9.26}$$

since

$$A'_0(t,\mathbf{x}) = A_0(t,\mathbf{x}) - \frac{\partial}{\partial t}\int^t dt'\, A_0(t',\mathbf{x}) = 0 . \tag{9.27}$$

After that, we still have the freedom of performing a gauge transformation
with $\theta$ independent of t, because this does not modify the condition
$A'_0 = 0$. We then perform a further gauge transformation which sends
$A'_\mu$ into a new field $A''_\mu = A'_\mu - \partial_\mu\theta$, choosing


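$$\theta(\mathbf{x}) = -\int \frac{d^3y}{4\pi|\mathbf{x}-\mathbf{y}|}\, \frac{\partial A'^i(t,\mathbf{y})}{\partial y^i} . \tag{9.28}$$

Despite the dependence on time of $A'^i$, the integral on the right-hand
side is actually independent of t. In fact, in this gauge $E^i = -\partial_t A'^i$,
since $A'_0 = 0$. Then the vacuum Maxwell equation $\partial_i E^i = 0$ implies
$\partial_t \partial_i A'^i = 0$. It then follows from eq. (9.28) that $\partial_t\theta = 0$.

Furthermore, from $A''^i = A'^i - \partial^i\theta$ it follows that

$$\nabla\cdot\mathbf{A}'' = \nabla\cdot\mathbf{A}' - \nabla^2\theta . \tag{9.29}$$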
Using in eq. (9.28) the identity

$$\nabla^2_{\mathbf{x}} \left( \frac{1}{4\pi|\mathbf{x}-\mathbf{y}|} \right) = -\delta^{(3)}(\mathbf{x}-\mathbf{y}) , \tag{9.30}$$

that we derived in eq. (1.90), we get

$$\nabla^2\theta = \nabla\cdot\mathbf{A}' , \tag{9.31}$$

and therefore $\nabla\cdot\mathbf{A}'' = 0$. We have therefore used the gauge freedom to
set

$$A_0 = 0 , \qquad \nabla\cdot\mathbf{A} = 0 \tag{9.32}$$

(where we have eventually removed the double primes on A). This gauge
is called the radiation gauge. Note that it implies the Lorenz gauge
$\partial_\mu A^\mu = 0$, as well as the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$. Thus, neither the
Lorenz nor the Coulomb gauge fixes the gauge freedom completely.
In contrast, in the radiation gauge there is no residual gauge freedom.

In any case, whether we directly fix the radiation gauge, or else we only
fix the Coulomb gauge and then discover that $\phi = 0$ from the solution
of eq. (9.23), we end up with eqs. (9.24) and (9.25). Since $\phi = 0$, the
former simply becomes $\Box\mathbf{A} = 0$, so, in the end, $\mathbf{A}(t,\mathbf{x})$ must solve the
two equations

$$\Box\mathbf{A} = 0 , \tag{9.33}$$
$$\nabla\cdot\mathbf{A} = 0 . \tag{9.34}$$
We can then proceed in a way completely analogous to the discussion
in Section 9.2. Working in Fourier space, the solutions of eq. (9.33) are
superpositions of

$$\mathbf{A}(t,\mathbf{x}) = A_k\, \hat{\boldsymbol{\epsilon}}(\mathbf{k})\, e^{ikx} , \tag{9.35}$$

and its complex conjugate, with $k^2 = 0$, i.e., $(k^0)^2 = |\mathbf{k}|^2$. The quantity
$A_k$ is the amplitude, while the polarization vector is normalized as

$$\hat{\boldsymbol{\epsilon}}(\mathbf{k})\cdot\hat{\boldsymbol{\epsilon}}^*(\mathbf{k}) = 1 , \tag{9.36}$$

which becomes simply $\hat{\boldsymbol{\epsilon}}(\mathbf{k})\cdot\hat{\boldsymbol{\epsilon}}(\mathbf{k}) = 1$ if we take a real polarization vector.
Equation (9.34) requires that

$$\hat{\boldsymbol{\epsilon}}(\mathbf{k})\cdot\mathbf{k} = 0 . \tag{9.37}$$

Therefore, we again find the two transverse polarizations that we have
already found in the Lorenz gauge. The most general solution is given
by a superposition of the elementary solutions, labeled by $\mathbf{k}$ and by an
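index $\lambda = 1,2$ that identifies the two independent solutions of eq. (9.37),
with arbitrary amplitudes $A_{k,\lambda}$,

$$\mathbf{A}(t,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3} \sum_{k^0 = \pm|\mathbf{k}|}\ \sum_{\lambda=1,2} \left[ \hat{\boldsymbol{\epsilon}}(\mathbf{k},\lambda)\, A_{k,\lambda}\, e^{ikx} + \hat{\boldsymbol{\epsilon}}^*(\mathbf{k},\lambda)\, A^*_{k,\lambda}\, e^{-ikx} \right] . \tag{9.38}$$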
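Note that the equation $k^2 = 0$ has two solutions, $k^0 = \pm|\mathbf{k}|$. However,
once we consider the general superposition of plane waves $e^{ikx}$ and $e^{-ikx}$,
we can limit ourselves to $k^0 = +|\mathbf{k}|$. For example, when $k^0 = -|\mathbf{k}|$ we
have

$$\left( e^{ikx} \right)_{k^0 = -|\mathbf{k}|} = \left( e^{-ik^0x^0 + i\mathbf{k}\cdot\mathbf{x}} \right)_{k^0 = -|\mathbf{k}|} = \exp\{ i|\mathbf{k}|x^0 + i\mathbf{k}\cdot\mathbf{x} \} . \tag{9.39}$$

This is the same as a term $e^{-ikx}$ with $k^0 = +|\mathbf{k}|$, i.e.,

$$\exp\{ i|\mathbf{k}|x^0 - i\mathbf{k}\cdot\mathbf{x} \} , \tag{9.40}$$

after renaming the integration variable $\mathbf{k}$ into $-\mathbf{k}$. As in eq. (9.11), we
define $\omega$ from $\omega/c = k^0$. Since we can restrict to $k^0 = +|\mathbf{k}|$, we have

$$\frac{\omega}{c} = k^0 = +|\mathbf{k}| . \tag{9.41}$$

We can then rewrite eq. (9.38) more simply as

$$\mathbf{A}(t,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3} \sum_{\lambda=1,2} \left[ \hat{\boldsymbol{\epsilon}}(\mathbf{k},\lambda)\, A_{k,\lambda}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{\boldsymbol{\epsilon}}^*(\mathbf{k},\lambda)\, A^*_{k,\lambda}\, e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}} \right] . \tag{9.42}$$

The vectors $\hat{\boldsymbol{\epsilon}}(\mathbf{k},\lambda)$, with $\lambda = 1,2$, in eq. (9.42) are the two independent
solutions of $\mathbf{k}\cdot\hat{\boldsymbol{\epsilon}}(\mathbf{k}) = 0$. As a basis, we can use for instance the linear
polarizations, or the circular polarizations; for $\mathbf{k}$ pointing along the z
direction, they are given by eq. (9.21) and eq. (9.22), respectively.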


9.4 Solutions for E and B 


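We can now immediately read the solutions for $\mathbf{E}$ and $\mathbf{B}$ from the
solutions for the gauge potential. It is simpler to work in the Coulomb
gauge, where $\phi = 0$, and $\mathbf{A}$ is given by eq. (9.35) together with the
condition (9.37). From eq. (3.80), and eq. (3.83) with $\phi = 0$, we have

$$\mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} , \qquad \mathbf{B} = \nabla\times\mathbf{A} . \tag{9.43}$$

We take for $\mathbf{A}$ a solution of the form

$$\mathbf{A}(t,\mathbf{x}) = A_k\, \hat{\boldsymbol{\epsilon}}(\mathbf{k})\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \tag{9.44}$$

with $\mathbf{k}\cdot\hat{\boldsymbol{\epsilon}}(\mathbf{k}) = 0$. We also use $\mathbf{k} = (\omega/c)\hat{\mathbf{k}}$, that follows from eq. (9.41).
Then, we get

$$\mathbf{E}(t,\mathbf{x}) = \hat{\boldsymbol{\epsilon}}(\mathbf{k})\, i\omega A_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \tag{9.45}$$
$$\mathbf{B}(t,\mathbf{x}) = [\hat{\mathbf{k}}\times\hat{\boldsymbol{\epsilon}}(\mathbf{k})]\, \frac{i\omega}{c} A_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} . \tag{9.46}$$

Defining $E_k = i\omega A_k$, we can rewrite this more compactly as

$$\mathbf{E}(t,\mathbf{x}) = \hat{\boldsymbol{\epsilon}}(\mathbf{k})\, E_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \tag{9.47}$$
$$c\mathbf{B}(t,\mathbf{x}) = [\hat{\mathbf{k}}\times\hat{\boldsymbol{\epsilon}}(\mathbf{k})]\, E_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} . \tag{9.48}$$

As always, the actual field will be given by the real part of these complex
expressions,

$$\mathbf{E}(t,\mathbf{x}) = {\rm Re}\left[ \hat{\boldsymbol{\epsilon}}(\mathbf{k})\, E_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} \right] , \tag{9.49}$$
$$c\mathbf{B}(t,\mathbf{x}) = {\rm Re}\left\{ [\hat{\mathbf{k}}\times\hat{\boldsymbol{\epsilon}}(\mathbf{k})]\, E_k\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} \right\} . \tag{9.50}$$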


Equations (9.47) and (9.48) show the basic features of electromagnetic
waves. Both the electric and magnetic field are transverse to the
propagation direction, and satisfy

$$c\mathbf{B}(t,\mathbf{x}) = \hat{\mathbf{k}}\times\mathbf{E}(t,\mathbf{x}) , \tag{9.51}$$

so they are orthogonal to each other, and their moduli are related by
$|\mathbf{E}| = c|\mathbf{B}|$. The dependence on space and time is through the phase
factor $e^{-i\varphi(t,\mathbf{x})}$, where

$$\varphi(t,\mathbf{x}) = \omega t - \mathbf{k}\cdot\mathbf{x} . \tag{9.52}$$

Using eq. (9.41), we can write

$$\varphi(t,\mathbf{x}) = -\frac{\omega}{c}\,(\hat{\mathbf{k}}\cdot\mathbf{x} - ct) , \tag{9.53}$$

and we see that the surfaces of constant phase, i.e., the surfaces where
$\varphi(t,\mathbf{x})$ is constant, travel at the speed of light.8 The quantity $\omega$ is the
frequency of the wave, since the wave is unchanged under a time
translation $t \to t + T$ with $T = 2\pi/\omega$, while its wavelength $\lambda$ is given by

$$\lambda = \frac{2\pi}{|\mathbf{k}|} , \tag{9.54}$$

since the wave is periodic if we translate it by $\lambda$ along the propagation
direction, i.e., under $\mathbf{x} \to \mathbf{x} + \lambda\hat{\mathbf{k}}$. The vector $\mathbf{k}$, with dimensions of
inverse length, is called the wavenumber. From eqs. (9.41) and (9.54),
we have the relation

$$\frac{\omega}{c} = \frac{2\pi}{\lambda} . \tag{9.55}$$

It is also useful to introduce the reduced wavelength $\bar{\lambda} = \lambda/(2\pi)$, so that

$$\bar{\lambda} = \frac{c}{\omega} . \tag{9.56}$$
8 We will expand on this in Section 15.2, where, besides this notion of "phase velocity," we will introduce the group velocity of a wave-packet, for waves propagating in a generic medium.

From eq. (3.44), and the fact that in any plane wave $B = |\mathbf{B}|$ and $E = |\mathbf{E}|$ are related by $cB = E$, it follows that the energy density $u(t,\mathbf{x})$ is

$$ u(t,\mathbf{x}) = \epsilon_0 |\mathbf{E}(t,\mathbf{x})|^2. \qquad (9.57) $$

Similarly, from eq. (3.56), the momentum density of an electromagnetic wave is

$$ \mathbf{g}(t,\mathbf{x}) = \epsilon_0\, \mathbf{E}(t,\mathbf{x})\times\mathbf{B}(t,\mathbf{x}) = \frac{\epsilon_0}{c}\, |\mathbf{E}(t,\mathbf{x})|^2\, \hat{\mathbf{k}}, \qquad (9.58) $$

and the Poynting vector is given by

$$ \mathbf{S}(t,\mathbf{x}) = \epsilon_0 c\, |\mathbf{E}(t,\mathbf{x})|^2\, \hat{\mathbf{k}}. \qquad (9.59) $$

Note that

$$ \mathbf{g}(t,\mathbf{x}) = \frac{1}{c}\, u(t,\mathbf{x})\, \hat{\mathbf{k}}, \qquad (9.60) $$


so the relation between the energy density and the momentum density 
of a plane electromagnetic wave is the same as the relation between 
energy and momentum for a massless particle (see eq. (3.73) with $v = c$). This fact, which will find its full explanation in the quantum theory of electromagnetism, already suggests that, at the quantum level, an electromagnetic wave is a collection of massless particles, the photons.

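For a monochromatic electromagnetic wave, we are typically interested in the energy density and momentum density averaged in time over a period of the wave, at a fixed point in space. We denote this average by angular brackets $\langle\ldots\rangle$, so

$$ \langle u(t,\mathbf{x})\rangle = \epsilon_0\, \langle |\mathbf{E}(t,\mathbf{x})|^2\rangle, \qquad (9.61) $$

$$ \langle \mathbf{g}(t,\mathbf{x})\rangle = \frac{\epsilon_0}{c}\, \langle |\mathbf{E}(t,\mathbf{x})|^2\rangle\, \hat{\mathbf{k}}. \qquad (9.62) $$

Using eq. (9.49),

$$ \mathbf{E}(t,\mathbf{x}) = \frac{1}{2}\left[\hat{\boldsymbol{\epsilon}}(\mathbf{k})\, E_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{\boldsymbol{\epsilon}}^*(\mathbf{k})\, E^*_{\mathbf{k}}\, e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}}\right], \qquad (9.63) $$

so

$$ \langle |\mathbf{E}(t,\mathbf{x})|^2\rangle = \frac{1}{4}\Big\langle \left[\hat{\boldsymbol{\epsilon}}(\mathbf{k})\, E_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{\boldsymbol{\epsilon}}^*(\mathbf{k})\, E^*_{\mathbf{k}}\, e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}}\right] \cdot \left[\hat{\boldsymbol{\epsilon}}(\mathbf{k})\, E_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{\boldsymbol{\epsilon}}^*(\mathbf{k})\, E^*_{\mathbf{k}}\, e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}}\right]\Big\rangle. \qquad (9.64) $$

After taking the scalar product, the terms proportional to $e^{-2i\omega t}$ and to $e^{2i\omega t}$ average to zero, since $\langle\sin(2\omega t)\rangle = 0$ and $\langle\cos(2\omega t)\rangle = 0$, so only the cross terms, which are independent of time, remain. Then, for a monochromatic electromagnetic wave with polarization vector normalized as $\hat{\boldsymbol{\epsilon}}\cdot\hat{\boldsymbol{\epsilon}}^* = 1$,

$$ \langle |\mathbf{E}(t,\mathbf{x})|^2\rangle = \frac{1}{2}\,|E_{\mathbf{k}}|^2. \qquad (9.65) $$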
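The result (9.65) can be checked numerically by sampling the real field of eq. (9.49) over one period. The sketch below is only an illustration under simple assumptions: the function name `time_averaged_E2`, the uniform sampling and the sample amplitudes are not taken from the text.

```python
import cmath
import math


def time_averaged_E2(E_k, eps, omega, k_dot_x=0.0, n_samples=1000):
    """Average |E(t,x)|^2 over one period T = 2*pi/omega at a fixed point.

    The field is E(t,x) = Re[eps * E_k * exp(-i omega t + i k.x)], eq. (9.49);
    eps is a polarization vector normalized as eps . eps* = 1.
    """
    period = 2 * math.pi / omega
    total = 0.0
    for n in range(n_samples):
        t = n * period / n_samples
        phase = cmath.exp(-1j * omega * t + 1j * k_dot_x)
        total += sum((e * E_k * phase).real ** 2 for e in eps)
    return total / n_samples


E_k = 1.5 - 0.7j  # hypothetical complex amplitude
eps_lin = (1.0, 0.0, 0.0)
eps_circ = (1 / math.sqrt(2), 1j / math.sqrt(2), 0.0)

avg_lin = time_averaged_E2(E_k, eps_lin, omega=3.0, k_dot_x=0.4)
avg_circ = time_averaged_E2(E_k, eps_circ, omega=3.0, k_dot_x=0.4)
expected = 0.5 * abs(E_k) ** 2  # right-hand side of eq. (9.65)
```

Both averages should agree with `expected` up to rounding. For circular polarization $|\mathbf{E}|^2$ is in fact constant in time, so the average is trivial in that case.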


As an alternative route, rather than working with the gauge poten- 
tials, we could have derived wave equations for E and B directly from 
Maxwell’s equations in vacuum, 


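$$ \nabla\cdot\mathbf{E} = 0, \qquad (9.66) $$

$$ \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = 0, \qquad (9.67) $$

$$ \nabla\cdot\mathbf{B} = 0, \qquad (9.68) $$

$$ \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0. \qquad (9.69) $$

We take the curl of eq. (9.69), and we use

$$ [\nabla\times(\nabla\times\mathbf{E})]_i = \epsilon_{ijk}\partial_j(\epsilon_{klm}\partial_l E_m) = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\,\partial_j\partial_l E_m = \partial_i(\nabla\cdot\mathbf{E}) - \nabla^2 E_i, \qquad (9.70) $$

or, in vector form,

$$ \nabla\times(\nabla\times\mathbf{E}) = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}. \qquad (9.71) $$

Therefore we get

$$ \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E} + \frac{\partial}{\partial t}(\nabla\times\mathbf{B}) = 0. \qquad (9.72) $$

Using eqs. (9.66) and (9.67) we then obtain

$$ \left(-\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2\right)\mathbf{E} = 0, \qquad (9.73) $$

i.e., $\Box\mathbf{E} = 0$. Similarly, taking the curl of eq. (9.67) and using eqs. (9.68) and (9.69), we get

$$ \left(-\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2\right)\mathbf{B} = 0. \qquad (9.74) $$

Note, however, that eqs. (9.73) and (9.74) are a consequence of the full Maxwell's equations (9.66)-(9.69), but are not equivalent to them. The general solution of eqs. (9.73) and (9.74) is a superposition of plane waves of the form

$$ \mathbf{E}(t,\mathbf{x}) = \mathbf{E}_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}, \qquad (9.75) $$

$$ \mathbf{B}(t,\mathbf{x}) = \mathbf{B}_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}, \qquad (9.76) $$

with $\omega/c = k$, where $k = |\mathbf{k}|$, but $\mathbf{E}_{\mathbf{k}}$ and $\mathbf{B}_{\mathbf{k}}$ independent, and arbitrary. Once we put these solutions back into the full set of Maxwell's equations, eq. (9.66) further imposes

$$ \mathbf{k}\cdot\mathbf{E}_{\mathbf{k}} = 0, \qquad (9.77) $$

while eq. (9.68) gives

$$ \mathbf{k}\cdot\mathbf{B}_{\mathbf{k}} = 0. \qquad (9.78) $$

Finally, eq. (9.67) gives

$$ \mathbf{k}\times\mathbf{B}_{\mathbf{k}} + \frac{\omega}{c^2}\,\mathbf{E}_{\mathbf{k}} = 0, \qquad (9.79) $$

which, using $\omega/c = k$, becomes

$$ \mathbf{E}_{\mathbf{k}} = -c\,\hat{\mathbf{k}}\times\mathbf{B}_{\mathbf{k}}, \qquad (9.80) $$

while eq. (9.69) gives the equivalent equation

$$ c\mathbf{B}_{\mathbf{k}} = \hat{\mathbf{k}}\times\mathbf{E}_{\mathbf{k}}. \qquad (9.81) $$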


We have therefore recovered the solution (9.47), (9.48).9

9 In Note 19 on page 395 we will show that, actually, this result could have been obtained using just eqs. (9.67) and (9.69), rather than the full set of vacuum Maxwell's equations.

Fig. 9.1 A linearly polarized electromagnetic wave.

Fig. 9.2 A circularly polarized electromagnetic wave.


9.5 Polarization of light 


Consider a monochromatic wave propagating along the $z$ direction, so $\hat{\mathbf{k}} = \hat{\mathbf{z}}$, with $E_{\mathbf{k}} = E$ real, and consider the basis of the linear polarizations (9.21), $\hat{\boldsymbol{\epsilon}}_{(1)} = \hat{\mathbf{x}}$, $\hat{\boldsymbol{\epsilon}}_{(2)} = \hat{\mathbf{y}}$. Since these are real, for a wave linearly polarized along the $x$ axis eq. (9.49) gives

$$ \mathbf{E}(t,\mathbf{x}) = E\cos(\omega t - kz)\,\hat{\mathbf{x}}, \qquad (9.82) $$

and eq. (9.50) gives $c\mathbf{B}(t,\mathbf{x}) = E\cos(\omega t - kz)\,\hat{\mathbf{y}}$. Thus, as a function of time (for given $z$), the electric field oscillates along a fixed direction in the $(x,y)$ plane, which in this case is the $x$ axis, and the magnetic field along the orthogonal direction in the $(x,y)$ plane, in this case the $y$ axis, and similarly as a function of $z$ at fixed $t$, as illustrated in Fig. 9.1. In the same way, for a wave linearly polarized along the $y$ axis,

$$ \mathbf{E}(t,\mathbf{x}) = E\cos(\omega t - kz)\,\hat{\mathbf{y}}, \qquad (9.83) $$

and $c\mathbf{B}(t,\mathbf{x}) = -E\cos(\omega t - kz)\,\hat{\mathbf{x}}$. So, in these cases, the electric field (as well as the magnetic field) oscillates along a fixed direction. This is the origin of the name "linear polarization." A combination with real coefficients of the solutions (9.82) and (9.83) gives a solution that oscillates along a generic direction in the $(x,y)$ plane.

Consider now the polarization vectors (9.22). In this case, $\hat{\boldsymbol{\epsilon}}_{(+)} = (\hat{\mathbf{x}} + i\hat{\mathbf{y}})/\sqrt{2}$. Then eq. (9.49) gives

$$ \mathbf{E}(t,\mathbf{x}) = \frac{E}{\sqrt{2}}\left[\cos(\omega t - kz)\,\hat{\mathbf{x}} + \sin(\omega t - kz)\,\hat{\mathbf{y}}\right]. \qquad (9.84) $$

This represents a vector that, as $t$ increases for fixed $z$, rotates counterclockwise in the $(x,y)$ plane, describing a circle; hence, the name circular polarization. Equivalently, as $z$ increases for fixed $t$, it rotates clockwise in the $(x,y)$ plane, as illustrated in Fig. 9.2. With respect to the wave propagating along the $+z$ direction, this is called a right circular polarization. Similarly, with $\hat{\boldsymbol{\epsilon}}_{(-)}$ the polarization vector rotates clockwise in the $(x,y)$ plane as $t$ increases for fixed $z$, and describes left circular polarization. The corresponding magnetic field is given by eq. (9.48), and rotates so as to remain orthogonal to $\mathbf{E}$, in the $(x,y)$ plane transverse to the propagation direction $\hat{\mathbf{k}}$.

The most general case, for a monochromatic wave propagating along the $+z$ axis, is given by

$$ \mathbf{E}(t,\mathbf{x}) = \mathrm{Re}\left[(E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}})\, e^{-i(\omega t - kz)}\right], \qquad (9.85) $$

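where $E_x$ and $E_y$ are arbitrary complex quantities. Writing

$$ E_x = A_1 e^{-i\delta_1}, \qquad E_y = A_2 e^{-i\delta_2}, \qquad (9.86) $$

with $A_1$, $A_2$, $\delta_1$, and $\delta_2$ real, and using the notation $\varphi = \omega t - kz$, we get

$$ \mathbf{E}(t,\mathbf{x}) = E_x(t,z)\,\hat{\mathbf{x}} + E_y(t,z)\,\hat{\mathbf{y}}, \qquad (9.87) $$

where

$$ E_x(t,z) = A_1\cos(\varphi + \delta_1), \qquad (9.88) $$

$$ E_y(t,z) = A_2\cos(\varphi + \delta_2), \qquad (9.89) $$

and the dependence on $(t,z)$ enters through $\varphi$. With simple trigonometry, we then find

$$ \frac{E_x}{A_1}\sin\delta_2 - \frac{E_y}{A_2}\sin\delta_1 = \cos\varphi\,\sin\delta, \qquad (9.90) $$

$$ \frac{E_x}{A_1}\cos\delta_2 - \frac{E_y}{A_2}\cos\delta_1 = \sin\varphi\,\sin\delta, \qquad (9.91) $$

where $\delta = \delta_2 - \delta_1$. Squaring the two equations and summing them, we get

$$ \left(\frac{E_x}{A_1}\right)^2 + \left(\frac{E_y}{A_2}\right)^2 - 2\left(\frac{E_x}{A_1}\right)\left(\frac{E_y}{A_2}\right)\cos\delta = \sin^2\delta. \qquad (9.92) $$

This is the equation of an ellipse in the $(E_x, E_y)$ plane, with semi-axes $A_1$ and $A_2$. Correspondingly, light is said to be elliptically polarized, see Fig. 9.3.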
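The ellipse equation (9.92) can be verified numerically by sweeping the phase $\varphi$ over one period and evaluating both sides. This is a minimal sketch; the function names and the sample amplitudes and phases are hypothetical choices made only for illustration.

```python
import math


def field_components(A1, A2, delta1, delta2, phi):
    """E_x and E_y of eqs. (9.88)-(9.89) at phase phi = omega*t - k*z."""
    return A1 * math.cos(phi + delta1), A2 * math.cos(phi + delta2)


def ellipse_residual(A1, A2, delta1, delta2, phi):
    """Left-hand side minus right-hand side of eq. (9.92)."""
    Ex, Ey = field_components(A1, A2, delta1, delta2, phi)
    delta = delta2 - delta1
    u, v = Ex / A1, Ey / A2
    return u * u + v * v - 2 * u * v * math.cos(delta) - math.sin(delta) ** 2


# Sweep phi over one full period for one (arbitrary) choice of parameters.
residuals = [ellipse_residual(2.0, 0.5, 0.3, 1.1, 2 * math.pi * n / 50)
             for n in range(50)]
max_residual = max(abs(r) for r in residuals)
```

The residual should vanish up to rounding for every value of the phase, which confirms that the tip of the electric field stays on the curve (9.92) as $t$ or $z$ varies.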

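For $\delta = \pi/2$ (or, more generally, for $\delta = m\pi/2$ with $m = \pm 1, \pm 3, \ldots$), we get

$$ \left(\frac{E_x}{A_1}\right)^2 + \left(\frac{E_y}{A_2}\right)^2 = 1, \qquad (9.93) $$

so the semi-axes are aligned with the $E_x$ and $E_y$ axes. If, furthermore, $A_1 = A_2$, the ellipse becomes a circle, and we get back circular polarization. If, instead, $\delta = m\pi$ with $m = 0, \pm 1, \pm 2, \ldots$ we get

$$ \left(\frac{E_x}{A_1} \pm \frac{E_y}{A_2}\right)^2 = 0 \qquad (9.94) $$

(with the plus sign for $m$ odd and the minus for $m$ even) so that

$$ \frac{E_x}{A_1} = \mp\frac{E_y}{A_2}, \qquad (9.95) $$

and therefore

$$ \frac{E_y}{E_x} = \mp\frac{A_2}{A_1} \qquad (9.96) $$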

is fixed. Therefore, the electric field does not rotate in the $(E_x, E_y)$ plane, and we get back linear polarization.

Fig. 9.3 An elliptically polarized electromagnetic wave.

We have seen that the space and time dependence of a monochromatic electromagnetic wave is given by the factor $e^{ikx}$ (and its complex conjugate), where

$$ k^\mu = (k^0, \mathbf{k}) = \frac{\omega}{c}\,(1, \hat{\mathbf{k}}). \qquad (9.97) $$

While the speed of light $c$ is the same for all observers, the frequency $\omega$ and the propagation direction $\hat{\mathbf{k}}$ depend on the observer, in a way determined by the fact that $k^\mu$ is a four-vector. As we will see in this section, this gives rise to the relativistic Doppler effect and to the aberration of light.

To fix the geometry, consider a source $S$, such as a star, and let $K_s$ denote an inertial frame comoving with the star. In this frame the star therefore has zero velocity. We denote by $(t_s, \mathbf{x}_s)$ the coordinates of the source in this frame, and by $\hat{\mathbf{n}}_s$ the unit vector from the origin of this reference frame toward the star. Consider now a second reference frame $K_{\rm obs}$, moving with uniform velocity $\mathbf{v}$ with respect to $K_s$. In this frame, the star has a velocity $\mathbf{v}_s = -\mathbf{v}$. We fix the origins of the two reference frames so that, at a given time $t_0$, they coincide, and we call $O_{\rm obs}$ an observer that sits at the origin of the frame $K_{\rm obs}$. We denote by $(t_{\rm obs}, \mathbf{x}_{\rm obs})$ the coordinates of the source in this frame, and by $\hat{\mathbf{n}}_{\rm obs}$ the unit vector from the observer to the source, in this frame, at the given time considered. From eqs. (7.22) and (7.23), written for a boost in a generic direction, the coordinates of the source in the two frames are related by

$$ x^0_{\rm obs} = \gamma_s\,(x^0_s + \boldsymbol{\beta}_s\cdot\mathbf{x}_s), \qquad (9.98) $$

$$ \mathbf{x}_{\parallel,{\rm obs}} = \gamma_s\,(\mathbf{x}_{\parallel,s} + \boldsymbol{\beta}_s x^0_s), \qquad (9.99) $$

$$ \mathbf{x}_{\perp,{\rm obs}} = \mathbf{x}_{\perp,s}, \qquad (9.100) $$

where $\gamma_s = \gamma(v_s)$, and we have split the vector part of the equation into the components parallel and perpendicular to $\boldsymbol{\beta}_s$, $\mathbf{x} = \mathbf{x}_\parallel + \mathbf{x}_\perp$. Note that we have chosen the signs so that, if the source is at rest in the frame $K_s$, it moves in the direction $+\boldsymbol{\beta}_s$ in the frame $K_{\rm obs}$.

In the reference frame $K_s$, the four-momentum of the light that propagates from the source to the origin of the coordinate system is

$$ k^\mu_s = \frac{\omega_s}{c}\,(1, -\hat{\mathbf{n}}_s). \qquad (9.101) $$

Note that $\hat{\mathbf{n}}_s$ is defined to point from the observer toward the source, while light propagates from the source to the observer, so $\hat{\mathbf{k}}_s = -\hat{\mathbf{n}}_s$. In the reference frame $K_{\rm obs}$, the four-momentum of the light emitted from the source and received at time $t_0$ by the observer $O_{\rm obs}$ is

$$ k^\mu_{\rm obs} = \frac{\omega_{\rm obs}}{c}\,(1, -\hat{\mathbf{n}}_{\rm obs}). \qquad (9.102) $$


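The four-momenta $k^\mu_s$ and $k^\mu_{\rm obs}$ are related by a Lorentz transformation, completely analogous to eqs. (9.98)-(9.100). Splitting again the vector part into the components parallel and perpendicular to $\boldsymbol{\beta}_s = \mathbf{v}_s/c$, we have

$$ k^0_{\rm obs} = \gamma_s\,(k^0_s + \boldsymbol{\beta}_s\cdot\mathbf{k}_s), \qquad (9.103) $$

$$ \mathbf{k}_{\parallel,{\rm obs}} = \gamma_s\,(\mathbf{k}_{\parallel,s} + \boldsymbol{\beta}_s k^0_s), \qquad (9.104) $$

$$ \mathbf{k}_{\perp,{\rm obs}} = \mathbf{k}_{\perp,s}. \qquad (9.105) $$

The inversion of these equations gives

$$ k^0_s = \gamma_s\,(k^0_{\rm obs} - \boldsymbol{\beta}_s\cdot\mathbf{k}_{\rm obs}), \qquad (9.106) $$

$$ \mathbf{k}_{\parallel,s} = \gamma_s\,(\mathbf{k}_{\parallel,{\rm obs}} - \boldsymbol{\beta}_s k^0_{\rm obs}). \qquad (9.107) $$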


This could be easily checked analytically, but in fact we can more simply observe that all relations between the quantities in the frame $K_s$ and the corresponding quantities in the frame $K_{\rm obs}$ can be inverted by replacing $\boldsymbol{\beta}_s$ with $-\boldsymbol{\beta}_s$ since, if $K_s$ moves with velocity $\mathbf{v}_s$ with respect to $K_{\rm obs}$, then $K_{\rm obs}$ moves with velocity $-\mathbf{v}_s$ with respect to $K_s$. Inserting eqs. (9.101) and (9.102) into eq. (9.106) gives

$$ \omega_s = \gamma_s\,\omega_{\rm obs}\,(1 + \boldsymbol{\beta}_s\cdot\hat{\mathbf{n}}_{\rm obs}). \qquad (9.108) $$

Therefore

$$ \omega_{\rm obs} = \frac{\omega_s}{\gamma_s\,(1 + \boldsymbol{\beta}_s\cdot\hat{\mathbf{n}}_{\rm obs})}. \qquad (9.109) $$

This equation gives the frequency $\omega_{\rm obs}$, as measured by the inertial observer $O_{\rm obs}$ for which the source has a velocity $\mathbf{v}_s$, as a function of the intrinsic frequency of the source $\omega_s$ (i.e., the frequency measured by an inertial observer for which the source is at rest), of the velocity $\mathbf{v}_s$ of the source in the frame of $O_{\rm obs}$, and of the direction of the source, $\hat{\mathbf{n}}_{\rm obs}$, again as measured by the observer $O_{\rm obs}$. Using the explicit expression for $\gamma(v_s)$ and writing $\boldsymbol{\beta}_s\cdot\hat{\mathbf{n}}_{\rm obs} = \beta_s\cos\theta_{\rm obs}$ (where $\beta_s = |\boldsymbol{\beta}_s| > 0$), we can rewrite it as

$$ \omega_{\rm obs} = \omega_s\,\frac{(1 - \beta_s^2)^{1/2}}{1 + \beta_s\cos\theta_{\rm obs}}. \qquad (9.110) $$


This change of frequency between a frame comoving with the source 
and a frame where the source has non-zero velocity is called the Doppler 
effect. 

In the limit $\beta_s \ll 1$ we can expand eq. (9.110) in powers of $\beta_s$. The terms of order $\beta_s$ and $\beta_s^2$ are called the first-order and the second-order Doppler effect, respectively. The first-order Doppler effect is given by

$$ \omega_{\rm obs} \simeq \omega_s\,(1 - \beta_s\cos\theta_{\rm obs}). \qquad (9.111) $$

If the source is moving away from the observer, $\cos\theta_{\rm obs} > 0$ and then $\omega_{\rm obs} < \omega_s$. One conventionally says that the frequency of light is redshifted (this nomenclature, of course, had its origin in the shift of the frequency of visible light, but is now universally used just to mean that the frequency decreases). Conversely, if the source comes toward the observer, $\cos\theta_{\rm obs} < 0$ and $\omega_{\rm obs} > \omega_s$, i.e., light is "blueshifted."

10 In this case, of course, the role of the speed of light is played by the speed of sound (in the rest frame of the medium where the wave propagates).

Fig. 9.4 The geometric setting of the Doppler effect discussed in the text.

Note, however, that for $\theta_{\rm obs} = \pi/2$, i.e., for a source moving in a direction orthogonal to the line of sight of the observer, there is no first-order effect, and

$$ \omega_{\rm obs} = \omega_s\,(1 - \beta_s^2)^{1/2} \qquad (\cos\theta_{\rm obs} = 0). \qquad (9.112) $$

This is called the transverse Doppler effect, and always corresponds to a redshift.
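The Doppler formula (9.110), its first-order approximation (9.111) and the transverse limit (9.112) are easy to explore numerically. The sketch below is only illustrative; the function names and the sample velocities are assumptions made here, not notation from the text.

```python
import math


def doppler_ratio(beta_s, theta_obs):
    """omega_obs / omega_s from eq. (9.110)."""
    return math.sqrt(1.0 - beta_s ** 2) / (1.0 + beta_s * math.cos(theta_obs))


def doppler_ratio_first_order(beta_s, theta_obs):
    """First-order approximation omega_obs / omega_s, eq. (9.111)."""
    return 1.0 - beta_s * math.cos(theta_obs)


beta = 0.6  # hypothetical source speed in units of c
receding = doppler_ratio(beta, 0.0)             # source moving away: redshift
approaching = doppler_ratio(beta, math.pi)      # source approaching: blueshift
transverse = doppler_ratio(beta, math.pi / 2)   # eq. (9.112)

# For small beta the exact and first-order results differ at order beta^2.
small_beta = 1e-3
difference = abs(doppler_ratio(small_beta, 0.0)
                 - doppler_ratio_first_order(small_beta, 0.0))
```

With $\beta_s = 0.6$ the receding and approaching ratios reduce to $\sqrt{(1-\beta_s)/(1+\beta_s)} = 0.5$ and its inverse, while the transverse ratio equals $1/\gamma_s = 0.8$, a redshift even though the first-order term vanishes.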

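Next consider the relation between $\hat{\mathbf{n}}_s$ and $\hat{\mathbf{n}}_{\rm obs}$. The relation between the components of $\hat{\mathbf{n}}_s$ and $\hat{\mathbf{n}}_{\rm obs}$ parallel to $\boldsymbol{\beta}_s$ is obtained by inserting eqs. (9.101) and (9.102) into eq. (9.107). Writing

$$ n^{\parallel}_s = \cos\theta_s, \qquad (9.113) $$

$$ n^{\parallel}_{\rm obs} = \cos\theta_{\rm obs}, \qquad (9.114) $$

and using eq. (9.109), we get

$$ \cos\theta_s = \frac{\cos\theta_{\rm obs} + \beta_s}{1 + \beta_s\cos\theta_{\rm obs}}. \qquad (9.115) $$

Again, this can be inverted by exchanging the labels "obs" and "s" and replacing $\beta_s \to -\beta_s$,

$$ \cos\theta_{\rm obs} = \frac{\cos\theta_s - \beta_s}{1 - \beta_s\cos\theta_s}. \qquad (9.116) $$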


Equation (9.116) also implies that

$$ \sin\theta_{\rm obs} = \frac{\sin\theta_s}{\gamma_s\,(1 - \beta_s\cos\theta_s)}, \qquad (9.117) $$

and

$$ \tan\theta_{\rm obs} = \frac{\sin\theta_s}{\gamma_s\,(\cos\theta_s - \beta_s)}. \qquad (9.118) $$

Equation (9.116) shows that the directions in which observers in the frames $K_s$ and $K_{\rm obs}$ see the source are not the same. This phenomenon is called the aberration of light.

The Doppler and aberration effects already exist in Galilean Relativity, i.e., when the transformation between coordinates of inertial frames is given by eq. (7.1). For sound waves, the Doppler effect is the familiar change of pitch of the siren of an ambulance from when it approaches to when it recedes from us.10 A familiar example of aberration can be given by the tracks left by the rain on the window of a moving train, which have an inclination with respect to the vertical due to the velocity of the train. In both cases, these effects can be derived as a consequence of the non-relativistic composition law for velocities, as follows.

For the Doppler effect consider a source that, in its rest frame, emits signals with a period $T$, and therefore a frequency $\omega_s = 2\pi/T$. As illustrated in Fig. 9.4, in the observer frame the source moves at velocity $\mathbf{v}_s$ and, at time $t_1$, is in the position $\mathbf{x}_1$. At time $t_2 = t_1 + T$ it will be in the position

$$ \mathbf{x}_2 = \mathbf{x}_1 + \mathbf{v}_s(t_2 - t_1) = \mathbf{x}_1 + \mathbf{v}_s T, \qquad (9.119) $$

and, as we see from Fig. 9.4, to reach an observer (located at a distance much larger than $v_s T$), light must travel an extra distance

$$ d = v_s T\cos\theta_{\rm obs}, \qquad (9.120) $$


or, in vector form,

$$ d = \mathbf{v}_s\cdot\hat{\mathbf{n}}_{\rm obs}\, T. \qquad (9.121) $$

Therefore, the difference in the time of arrival of two signals emitted at times $t$ and $t + T$ is

$$ T_{\rm obs} = T + \frac{d}{c} = T\,(1 + \boldsymbol{\beta}_s\cdot\hat{\mathbf{n}}_{\rm obs}), \qquad (9.122) $$

and therefore $\omega_{\rm obs} = 2\pi/T_{\rm obs}$ is related to $\omega_s = 2\pi/T$ by

$$ \omega_{\rm obs} = \frac{\omega_s}{1 + \boldsymbol{\beta}_s\cdot\hat{\mathbf{n}}_{\rm obs}}. \qquad (9.123) $$

This agrees with eq. (9.109), except for the factor $1/\gamma_s$, which is a purely relativistic effect and is simply due to the time dilatation effect that we studied in Section 7.2.3: for the observer, the clock on the star goes slower, and therefore it emits the second signal only after a time $\Delta t = 2\pi\gamma_s/\omega_s$, rather than $2\pi/\omega_s$. This reproduces the correct factor $1/\gamma_s$ in eq. (9.109). Note in particular that, in the non-relativistic computation, there is no transverse Doppler effect.

The non-relativistic expression for the aberration can be computed similarly. Consider a frame at rest with respect to the source, in which light has speed $c$ (of course, if we assume Galilean Relativity, the speed of light becomes frame dependent) and let

$$ \hat{\mathbf{n}}_s = (\cos\theta_s, \sin\theta_s, 0), \qquad (9.124) $$

be the unit vector toward the source in this frame. In this frame, the velocity of a light signal emitted at the source and reaching the origin of the reference frame is given by the vector

$$ \mathbf{c}_s = (-c\cos\theta_s, -c\sin\theta_s, 0). \qquad (9.125) $$

In the observer frame, where the source moves away from the observer with velocity $v_s$ along the positive direction of the $x$ axis, using the Galilean composition of velocities we would have

$$ \mathbf{c}_{\rm obs} = (-c\cos\theta_s + v_s, -c\sin\theta_s, 0). \qquad (9.126) $$

Therefore, the prediction of Galilean Relativity is that light will be seen to arrive from an angle $\theta_{\rm obs}$ such that

$$ \tan\theta_{\rm obs} = \frac{c\sin\theta_s}{c\cos\theta_s - v_s} = \frac{\sin\theta_s}{\cos\theta_s - \beta_s}. \qquad (9.127) $$

This agrees with eq. (9.118), except again for the factor $1/\gamma_s$, which is therefore a purely relativistic effect.
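To compare the relativistic aberration formulas (9.116)-(9.118) with the Galilean result (9.127), one can evaluate both numerically. The following is a small sketch under the conventions of the text; the function names are hypothetical, and `math.atan2` is used here only to pick the quadrant consistently from the sine and cosine.

```python
import math


def theta_obs_relativistic(beta_s, theta_s):
    """Observed angle from eqs. (9.116) and (9.117)."""
    gamma_s = 1.0 / math.sqrt(1.0 - beta_s ** 2)
    denom = 1.0 - beta_s * math.cos(theta_s)
    cos_obs = (math.cos(theta_s) - beta_s) / denom
    sin_obs = math.sin(theta_s) / (gamma_s * denom)
    return math.atan2(sin_obs, cos_obs)


def theta_obs_galilean(beta_s, theta_s):
    """Observed angle from the Galilean prediction, eq. (9.127)."""
    return math.atan2(math.sin(theta_s), math.cos(theta_s) - beta_s)


beta, theta_s = 0.5, 1.0  # hypothetical sample values
theta_rel = theta_obs_relativistic(beta, theta_s)
theta_gal = theta_obs_galilean(beta, theta_s)

# Consistency check: eq. (9.115) should map theta_rel back to theta_s.
cos_back = (math.cos(theta_rel) + beta) / (1.0 + beta * math.cos(theta_rel))
```

For $\beta_s \ll 1$ the two predictions agree up to terms of order $\beta_s^2$, as expected from the fact that they differ only by the factor $1/\gamma_s$ in the tangent.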


Electromagnetic field of 
moving charges 


In this chapter we study the electromagnetic fields generated by mov- 
ing charges. At the mathematical level, a fundamental tool is provided 
by the Green’s functions of the d’Alembertian operator, in particular 
the retarded Green’s function, that we introduce in Section 10.1. We 
will then be able to study the field generated by charges with arbitrary 
motion, and we will discover that, when a charge is accelerated, it pro- 
duces electromagnetic waves. We will then study in detail the radiation 
emitted in different situations. 


10.1 Advanced and retarded Green’s 
function 


To solve radiation problems we use the Green's function method, that we have already introduced in Section 4.1.2 for the case of the Laplace operator. Here, however, the relevant operator is the d'Alembertian operator (3.88) and, as we will see, the situation is richer because it admits different Green's functions. We define the Green's functions $G(x; x')$ of the d'Alembertian as the solutions of the equation1

$$ \Box_x G(x; x') = \delta^{(4)}(x - x'), \qquad (10.1) $$

or, more explicitly,

$$ \left[-\partial_0^2 + \nabla^2\right] G(x^0, \mathbf{x}; x'^0, \mathbf{x}') = \delta(x^0 - x'^0)\,\delta^{(3)}(\mathbf{x} - \mathbf{x}'). \qquad (10.2) $$

Once a Green's function has been found, a particular solution of an equation such as

$$ \Box f(x) = j(x) \qquad (10.3) $$

is given by

$$ f(x) = \int d^4x'\, G(x; x')\, j(x'), \qquad (10.4) $$

as can be checked by applying $\Box_x$ to both sides.2 The most general solution of eq. (10.3) is then obtained by adding the most general solution $f_{\rm hom}(x)$ of the homogeneous equation $\Box f = 0$, so that

$$ f(x) = f_{\rm hom}(x) + \int d^4x'\, G(x; x')\, j(x'). \qquad (10.5) $$


1 Different conventions exist in the literature for the normalization and overall sign of the Green's function. Sometimes the Green's functions of the d'Alembertian are instead defined as the solutions of

$$ \Box_x G(x; x') = -4\pi\,\delta^{(4)}(x - x'), $$

to reabsorb a factor $-1/(4\pi)$ that we will find in eq. (10.24) below. Also notice that we are using the signature $(-,+,+,+)$, so that $\Box = -(1/c^2)\partial_t^2 + \nabla^2$. Sometimes the Green's function is defined by $\Box_x G(x; x') = \delta^{(4)}(x - x')$, using, however, the opposite signature $(+,-,-,-)$, so $\Box = (1/c^2)\partial_t^2 - \nabla^2$; this definition then differs from ours by an overall sign.

2 This is valid as long as the integral converges. After having found the explicit form of the Green's function, we will discuss the corresponding boundary conditions that need to be imposed on $j(x)$.

Notice that the use of different Green's functions can be reabsorbed into a different choice of solution of the homogeneous equation. Indeed, let $G_1(x; x')$ and $G_2(x; x')$ be two different Green's functions, and define

$$ f_1(x) = \int d^4x'\, G_1(x; x')\, j(x'), \qquad (10.6) $$

$$ f_2(x) = \int d^4x'\, G_2(x; x')\, j(x'). \qquad (10.7) $$

Then $f_2(x) = f_{\rm hom}(x) + f_1(x)$, where

$$ f_{\rm hom}(x) = \int d^4x'\, [G_2(x; x') - G_1(x; x')]\, j(x'). \qquad (10.8) $$


Since

$$ \Box_x f_{\rm hom}(x) = \int d^4x'\, [\delta^{(4)}(x - x') - \delta^{(4)}(x - x')]\, j(x') = 0, \qquad (10.9) $$

the solutions $f_2(x)$ and $f_1(x)$ indeed differ by a solution of the homogeneous equation. Observe that, in the case of the Laplacian studied in Section 4.1.2, the homogeneous equation $\nabla^2\phi = 0$ (with the boundary condition that $\phi$ vanishes at infinity, which was the physically relevant one in the setting of electrostatics) only has the solution $\phi = 0$, and therefore the Green's function was unique. In contrast, as we saw in Chapter 9, an equation such as $\Box f(x) = 0$ has non-vanishing solutions, corresponding to plane waves, that are the physically relevant solutions in a radiation problem. The physically correct boundary conditions must therefore be such as to allow for the possibility of these solutions, and the choice of the homogeneous solution, or, equivalently, of the appropriate Green's function, reflects these boundary conditions.

We use the Green's function technique to compute the electromagnetic field generated by a generic current $j^\mu$. It is convenient to work in the Lorenz gauge $\partial_\mu A^\mu = 0$, so the equation to be solved is eq. (8.30). Given a Green's function $G(x; x')$ of the d'Alembertian operator, a particular solution of the inhomogeneous equation is then

$$ A^\mu(x) = -\mu_0 \int d^4x'\, G(x; x')\, j^\mu(x'), \qquad (10.10) $$

subject, again, to suitable boundary conditions on $j^\mu(x)$, such that the integral converges.

The problem, therefore, amounts to computing the Green's functions of the d'Alembertian operator. Without loss of generality, we can set $x' = 0$ and solve the equation

$$ \Box G(x) = \delta^{(4)}(x). \qquad (10.11) $$


A convenient way to solve this equation is to perform a Fourier transform only with respect to $x^0$, writing

$$ G(x^0, \mathbf{x}) = \int_{-\infty}^{+\infty} \frac{dk_0}{2\pi}\, e^{-ik_0x^0}\, \tilde{G}(k_0, \mathbf{x}). \qquad (10.12) $$

Then, eq. (10.2) becomes

$$ \int_{-\infty}^{+\infty} \frac{dk_0}{2\pi}\, e^{-ik_0x^0}\, (\nabla^2 + k_0^2)\, \tilde{G}(k_0, \mathbf{x}) = \int_{-\infty}^{+\infty} \frac{dk_0}{2\pi}\, e^{-ik_0x^0}\, \delta^{(3)}(\mathbf{x}), \qquad (10.13) $$

where, on the left-hand side, we inserted eq. (10.12), and, on the right-hand side, we used the integral representation (1.76) of $\delta(x^0)$. Therefore, inverting the Fourier transform with respect to $k_0$, we get

$$ (\nabla^2 + k_0^2)\, \tilde{G}(k_0, \mathbf{x}) = \delta^{(3)}(\mathbf{x}). \qquad (10.14) $$


The problem is now reduced to computing the Green's function of the three-dimensional operator $(\nabla^2 + k_0^2)$, which is called the Helmholtz operator. To compute this Green's function we observe that eq. (10.14) is invariant under rotations around the origin, where the Dirac delta sits, and therefore $\tilde{G}(k_0, \mathbf{x})$ depends on $\mathbf{x}$ only through $r = |\mathbf{x}|$. We define $f(k_0, r)$ from

$$ \tilde{G}(k_0, r) = -\frac{1}{4\pi r}\, f(k_0, r). \qquad (10.15) $$

Extracting explicitly a factor $1/r$ is convenient because, from eq. (1.90), the Laplacian of $1/r$ produces the Dirac delta. Then [suppressing, for notational simplicity, the argument $k_0$ from $f(k_0, r)$],


$$ \nabla^2\left[\frac{1}{r}\, f(r)\right] = -4\pi\,\delta^{(3)}(\mathbf{x})\, f(0) + \frac{1}{r}\, f''(r), \qquad (10.16) $$

where $f' = df/dr$.3 From this we get

$$ (\nabla^2 + k_0^2)\, \tilde{G}(\mathbf{x}) = \delta^{(3)}(\mathbf{x})\, f(0) - \frac{1}{4\pi r}\,(f'' + k_0^2 f), \qquad (10.18) $$

and therefore, to solve eq. (10.14), we must require $f(0) = 1$ and

$$ f'' + k_0^2 f = 0. \qquad (10.19) $$

The most general solution of this equation is

$$ f(r) = A\,e^{ik_0r} + B\,e^{-ik_0r}, \qquad (10.20) $$

and the condition $f(0) = 1$ fixes $A + B = 1$, so the most general solution of eq. (10.14) is

$$ \tilde{G}(k_0, \mathbf{x}) = -\frac{1}{4\pi r}\left[A\,e^{ik_0r} + (1 - A)\,e^{-ik_0r}\right]. \qquad (10.21) $$

There are therefore two independent solutions, that can be taken to be

$$ \tilde{G}^{\pm}(k_0, \mathbf{x}) = -\frac{1}{4\pi r}\, e^{\pm ik_0r}. \qquad (10.22) $$

We have therefore found the Green's functions of the Helmholtz operator (10.14), and we see that, for $k_0 \neq 0$, there are two independent Green's functions. For $k_0 = 0$, these two solutions become identical, and reduce to the unique Green's function of the Laplace operator, eq. (4.15). This is as expected, since the Helmholtz operator reduces to the Laplace operator when $k_0 = 0$. Inserting eq. (10.22) into eq. (10.12) we get

$$ G^{\pm}(x^0, \mathbf{x}) = -\frac{1}{4\pi r} \int_{-\infty}^{+\infty} \frac{dk_0}{2\pi}\, e^{-ik_0(x^0 \mp r)} = -\frac{1}{4\pi|\mathbf{x}|}\, \delta(x^0 \mp |\mathbf{x}|). \qquad (10.23) $$

3 The explicit computation goes as follows:

$$ \nabla^2\left[\frac{1}{r}\, f(r)\right] = \partial_i\partial_i\left[\frac{1}{r}\, f(r)\right] = \left(\nabla^2\frac{1}{r}\right) f(r) + 2\left(\partial_i\frac{1}{r}\right)\partial_i f + \frac{1}{r}\,\nabla^2 f. \qquad (10.17) $$

To compute $\nabla^2 f$ we use the expression for the Laplacian in spherical coordinates, eq. (1.26), while, to compute the term $\partial_i(1/r)\,\partial_i f$, we use $\partial_i r = n_i$, see eq. (6.12), so that

$$ \left(\partial_i\frac{1}{r}\right)\partial_i f = -\frac{1}{r^2}\,(\partial_i r)(\partial_i r)\, f' = -\frac{1}{r^2}\, f', $$

since $n_i n_i = 1$. Then, from eq. (1.90),

$$ \nabla^2\left[\frac{1}{r}\, f(r)\right] = -4\pi\,\delta^{(3)}(\mathbf{x})\, f(0) - \frac{2}{r^2}\, f' + \frac{1}{r}\,\frac{1}{r^2}\frac{d}{dr}\left(r^2 f'\right) = -4\pi\,\delta^{(3)}(\mathbf{x})\, f(0) + \frac{1}{r}\, f''(r). $$

4 Having found the explicit form of the Green's functions, we can also check that the integrals in eqs. (10.26) and (10.27) both converge if, for all times $t'$, the source is localized in space, i.e., for all $t'$, $j^\mu(t', \mathbf{x}')$ has compact support in $\mathbf{x}'$. Less stringent conditions could also be used, depending on the problem. For instance, to have a well-defined retarded solution (10.26) at a given time $t$, it is sufficient that $j^\mu(t', \mathbf{x}')$ is localized in space for all times $t' < t$. In practice, in most physical situations, $j^\mu(t', \mathbf{x}')$ has compact support in $\mathbf{x}'$ for all values of $t'$, and also switches off as $t' \to -\infty$.
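As a numerical sanity check of eq. (10.22), one can verify that $-e^{\pm ik_0r}/(4\pi r)$ is annihilated by the Helmholtz operator away from the origin. The sketch below uses a finite-difference second derivative together with the identity $\nabla^2 G = (1/r)\,d^2(rG)/dr^2$, valid for functions of $r$ only at $r > 0$; the function names, step size and sample values are assumptions made for illustration.

```python
import cmath
import math


def G_helmholtz(k0, r, sign=+1):
    """Green's function -exp(sign * i k0 r) / (4 pi r) of eq. (10.22)."""
    return -cmath.exp(sign * 1j * k0 * r) / (4 * math.pi * r)


def helmholtz_residual(k0, r, sign=+1, h=1e-4):
    """Finite-difference estimate of (nabla^2 + k0^2) G at a point r > 0."""
    def u(s):
        # u(s) = s * G(s), so that nabla^2 G = u''(r) / r for r > 0
        return s * G_helmholtz(k0, s, sign)

    d2u = (u(r + h) - 2 * u(r) + u(r - h)) / h ** 2
    return d2u / r + k0 ** 2 * G_helmholtz(k0, r, sign)


# Hypothetical sample points, for both signs in eq. (10.22).
residual_plus = abs(helmholtz_residual(3.0, 0.7, sign=+1))
residual_minus = abs(helmholtz_residual(3.0, 2.0, sign=-1))

# For k0 = 0 both solutions reduce to the Laplace Green's function -1/(4 pi r).
laplace_limit = G_helmholtz(0.0, 2.0)
```

The residuals are limited only by the finite-difference error. The delta function at the origin is of course not captured by a check at $r > 0$; it is fixed by the condition $f(0) = 1$.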


The result for a generic second argument $x'$, that we had set to zero, can be obtained from the fact that the Green's function $G(x; x')$ is actually a function only of $x - x'$, because of invariance under space-time translations. Using furthermore $ct$ instead of $x^0$, we get

$$ G^{\pm}(t, \mathbf{x}; t', \mathbf{x}') = -\frac{1}{4\pi|\mathbf{x} - \mathbf{x}'|}\, \delta\left[c(t - t') \mp |\mathbf{x} - \mathbf{x}'|\right]. \qquad (10.24) $$
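The corresponding inhomogeneous solutions for $A^\mu$, from eq. (10.10) (using $dx^0 = c\,dt$ and $\delta(ct) = (1/c)\,\delta(t)$ to write the final result in terms of $t$ rather than $x^0$), are

$$ [A^\mu(t, \mathbf{x})]^{\pm} = \frac{\mu_0}{4\pi} \int dt'\, d^3x'\, \frac{j^\mu(t', \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\, \delta\left[t' - (t \mp |\mathbf{x} - \mathbf{x}'|/c)\right]. \qquad (10.25) $$

Consider first the solution $[A^\mu(t, \mathbf{x})]^{+}$. Performing the integral over $dt'$ with the help of the Dirac delta we see that

$$ [A^\mu(t, \mathbf{x})]^{+} = \frac{\mu_0}{4\pi} \int d^3x'\, \frac{j^\mu(t - |\mathbf{x} - \mathbf{x}'|/c,\, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}. \qquad (10.26) $$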


Therefore, $[A^\mu(t, \mathbf{x})]^{+}$ depends on the value of the current $j^\mu(t', \mathbf{x}')$ only on the past light cone of the space-time point $(t, \mathbf{x})$, i.e., on the points $(t', \mathbf{x}')$ from which a signal, traveling at the speed of light, could arrive at $(t, \mathbf{x})$. The Green's function $G^{+}(t, \mathbf{x}; t', \mathbf{x}')$ is called the retarded Green's function. The retardation effect expresses the fact that a change in the source at a point $\mathbf{x}'$ at time $t'$ cannot instantaneously affect the value of the field at a different point $\mathbf{x}$. Rather, its effect will be felt only at a subsequent time $t = t' + |\mathbf{x} - \mathbf{x}'|/c$, so that $t - t'$ is equal to the time taken by light to travel from $\mathbf{x}'$ to $\mathbf{x}$. This is consistent with Special Relativity, and follows from the relativistic structure of the d'Alembertian.
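The retardation in eq. (10.26) can be made concrete with a crude discretization in which the source is replaced by a few point-like elements. This is only a sketch under that assumption: the function `retarded_potential`, the representation of the source as `(position, amplitude)` pairs and the switch-on example are all hypothetical, introduced here for illustration.

```python
import math

C = 299_792_458.0     # speed of light in vacuum, m/s
MU0 = 4e-7 * math.pi  # vacuum permeability in H/m (conventional value)


def retarded_potential(t, x, sources):
    """Discretized version of eq. (10.26) for one component of A^mu.

    `sources` is a list of (position, amplitude) pairs; amplitude(t') is
    assumed to return the chosen component of j^mu integrated over the
    small volume of that element, evaluated at time t'.
    """
    total = 0.0
    for x_src, amplitude in sources:
        dist = math.dist(x, x_src)
        t_ret = t - dist / C  # retarded time t' = t - |x - x'|/c
        total += amplitude(t_ret) / dist
    return MU0 / (4 * math.pi) * total


def switched_on(t_prime):
    """Unit source that is switched on at t' = 0."""
    return 1.0 if t_prime >= 0.0 else 0.0


source = [((0.0, 0.0, 0.0), switched_on)]
observer = (3.0, 0.0, 0.0)  # 3 m away: light travel time is about 1.0e-8 s

before = retarded_potential(0.5e-8, observer, source)  # signal not yet arrived
after = retarded_potential(2.0e-8, observer, source)   # signal has arrived
```

Before the light travel time has elapsed the potential at the observer vanishes, while afterwards it takes the static value $\mu_0/(4\pi r)$ times the source strength, in line with the discussion of the past light cone above.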
The other solution is

$$ [A^\mu(t, \mathbf{x})]^{-} = \frac{\mu_0}{4\pi} \int d^3x'\, \frac{j^\mu(t + |\mathbf{x} - \mathbf{x}'|/c,\, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}, \qquad (10.27) $$

and depends on the value of the current $j^\mu(t', \mathbf{x}')$ only on the future light cone. The corresponding Green's function $G^{-}(t, \mathbf{x}; t', \mathbf{x}')$ is called the advanced Green's function. Retarded and advanced Green's functions will also be denoted as $G_{\rm ret}$ and $G_{\rm adv}$, respectively.4

First of all, we can check the static limit of these solutions. In the static limit, $[A^\mu(t, \mathbf{x})]^{+}$ and $[A^\mu(t, \mathbf{x})]^{-}$ become identical, since $j^\mu(t', \mathbf{x}')$ loses the dependence on the first argument, so $j^\mu(t \mp |\mathbf{x} - \mathbf{x}'|/c, \mathbf{x}') = j^\mu(\mathbf{x}')$. Then, using $j^0(\mathbf{x}') = c\rho(\mathbf{x}')$, together with eqs. (8.12) and (8.27), we see that the $\mu = 0$ component of eq. (10.26) reduces to the static solution (4.16) for the scalar potential, while the vector part of eq. (10.26) reduces to the static solution (4.92) for the vector potential.

However, whenever there is an actual time dependence, the solutions are different and, in this case, the advanced solution $[A^\mu(t, \mathbf{x})]^{-}$ is physically unacceptable. At first, one might think that this is due to the fact that, since its value at time $t$ depends on what the source will do in the future, this solution violates causality.5 Actually, the reason why this solution is physically unacceptable is somewhat more subtle and is rather related to the possibility of imposing natural boundary conditions, as we now discuss. Using the retarded Green's function, the most general solution of eq. (8.30) can be written as

$$ A^\mu(t, \mathbf{x}) = A^\mu_{\rm in}(t, \mathbf{x}) - \mu_0 \int d^4x'\, G_{\rm ret}(x; x')\, j^\mu(x') = A^\mu_{\rm in}(t, \mathbf{x}) + \frac{\mu_0}{4\pi} \int d^3x'\, \frac{j^\mu(t - |\mathbf{x} - \mathbf{x}'|/c,\, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}, \qquad (10.28) $$

where $A^\mu_{\rm in}(t, \mathbf{x})$ is a general solution of the homogeneous equation. The physical meaning of $A^\mu_{\rm in}(t, \mathbf{x})$ can be seen by taking the limit $t \to -\infty$. In this case, the argument $t - |\mathbf{x} - \mathbf{x}'|/c$ of $j^\mu(t - |\mathbf{x} - \mathbf{x}'|/c, \mathbf{x}')$ also goes to $-\infty$ and, if we assume that the source is localized in time, in this limit $j^\mu(t - |\mathbf{x} - \mathbf{x}'|/c, \mathbf{x}')$ goes to zero, for all $\mathbf{x}$ and $\mathbf{x}'$, and the integral vanishes. Therefore, $A^\mu_{\rm in}(t, \mathbf{x})$ represents the initial value of $A^\mu(t, \mathbf{x})$, at $t \to -\infty$ and $\mathbf{x}$ arbitrary. Notice that the same argument cannot be made for the limit $t \to +\infty$, for all values of the argument $\mathbf{x}$ of $A^\mu(t, \mathbf{x})$. In particular, we might wish to study the behavior of $A^\mu(t, \mathbf{x})$ as $t \to \infty$ while $r = |\mathbf{x}|$ also goes to infinity, in such a way that $t - r/c$ stays fixed at a given value, that we denote by $t_u$, smaller than the time at which the source eventually switches off. Then, in this limit $j^\mu(t - |\mathbf{x} - \mathbf{x}'|/c, \mathbf{x}')$ goes to a non-vanishing value $j^\mu(t_u, \mathbf{x}')$, so it can contribute to $A^\mu(t, \mathbf{x})$.

If instead we use the advanced Green's function, we can write the solution as

$$ A^\mu(t, \mathbf{x}) = A^\mu_{\rm out}(t, \mathbf{x}) - \mu_0 \int d^4x'\, G_{\rm adv}(x; x')\, j^\mu(x') = A^\mu_{\rm out}(t, \mathbf{x}) + \frac{\mu_0}{4\pi} \int d^3x'\, \frac{j^\mu(t + |\mathbf{x} - \mathbf{x}'|/c,\, \mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}, \qquad (10.29) $$

where, again, $A^\mu_{\rm out}(t, \mathbf{x})$ is a general solution of the homogeneous equation. The same argument now shows that $A^\mu_{\rm out}(t, \mathbf{x})$ is the value of $A^\mu(t, \mathbf{x})$ at $t \to +\infty$, for all $\mathbf{x}$. We can rewrite this solution as

$$ A^\mu(t, \mathbf{x}) = A^\mu_{\rm out}(t, \mathbf{x}) - \mu_0 \int d^4x'\, [G_{\rm adv}(x; x') - G_{\rm ret}(x; x')]\, j^\mu(x') - \mu_0 \int d^4x'\, G_{\rm ret}(x; x')\, j^\mu(x'). \qquad (10.30) $$

5 To be more precise, the gauge potential is not directly observable, so it could a priori be acausal, as long as the corresponding electric and magnetic fields turned out to be causal. An example of this behavior will be discussed in Section 11.1.2. However, for the gauge potential (10.27), the corresponding electric and magnetic fields would also depend on the behavior of the sources on the future light cone.

6 One can make a parallel with the situation in which a glass falls from a table and breaks into pieces on the floor, with its initial mechanical energy dissipated into heat, which is a form of radiation. The time-reversed solution, where radiation is focused on the pieces of glass scattered on the floor, in such a precise way that they jump back on the table and reassemble into a glass, is a mathematically legitimate solution but, physically, it is meaningless.


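We now define

$$ A^\mu_{\rm rad}(t, \mathbf{x}) = -\mu_0 \int d^4x'\, [G_{\rm ret}(x; x') - G_{\rm adv}(x; x')]\, j^\mu(x'). \qquad (10.31) $$

This is a solution of the homogeneous equation $\Box A^\mu = 0$, by the same argument used in eqs. (10.6)-(10.9). Then, eq. (10.30) can be rewritten as

$$ A^\mu(t, \mathbf{x}) = A^\mu_{\rm out}(t, \mathbf{x}) - A^\mu_{\rm rad}(t, \mathbf{x}) - \mu_0 \int d^4x'\, G_{\rm ret}(x; x')\, j^\mu(x'). \qquad (10.32) $$

Since both $A^\mu_{\rm out}(t, \mathbf{x})$ and $A^\mu_{\rm rad}(t, \mathbf{x})$ are solutions of the homogeneous equation, their difference is also a solution of the homogeneous equation, so eq. (10.32) is of the form (10.28), with

$$ A^\mu_{\rm in}(t, \mathbf{x}) = A^\mu_{\rm out}(t, \mathbf{x}) - A^\mu_{\rm rad}(t, \mathbf{x}). \qquad (10.33) $$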
So, the apparently acausal solution (10.29) has been rewritten in terms of the retarded Green’s function. This shows that the problem with the advanced solution is not the apparent acausality: the advanced solution can be rewritten as an integral that depends only on the behavior of the source on the past light cone. The real problem is in the meaning of the associated homogeneous solution. As we have seen, when we use the retarded solution (10.28), the associated homogeneous solution is the value of the field at $t\to-\infty$. It is easy to specify a physically meaningful expression for $A^\mu_{\rm in}(t,\mathbf{x})$. For instance, setting $A^\mu_{\rm in}(t,\mathbf{x}) = 0$ describes a situation where, at $t\to-\infty$, there was no incoming radiation. A system of charges, accelerated by their mutual interactions, will then produce an outgoing radiation that can be computed by setting $A^\mu_{\rm in}(t,\mathbf{x}) = 0$ in eq. (10.28). In contrast, if we write the solution in the form (10.29), we must specify the function $A^\mu_{\rm out}(t,\mathbf{x})$, which has the meaning of the limit of the solution $A^\mu(t,\mathbf{x})$ for $t\to+\infty$. First of all, this is not what we typically want to do. In general, we want to specify initial conditions and see how a system evolves, rather than specifying the desired final outcome of the evolution. Furthermore, there is no simple way of specifying physically meaningful final conditions. For instance, a mathematically simple choice such as $A^\mu_{\rm out}(t,\mathbf{x}) = 0$ corresponds, physically, to a situation in which, at $t\to-\infty$, there was radiation coming from spatial infinity and impinging on a system of charges, perfectly tuned so that the charges of the system, accelerated both by their mutual interactions and by the incoming electromagnetic wave, emit outgoing electromagnetic waves that perfectly cancel each other, leaving a vanishing total outgoing radiation field. Such an initial condition is acceptable mathematically, but not physically. A physically meaningful solution can only be specified by writing the solution in the form (10.28), using the retarded Green’s function, and specifying the initial field $A^\mu_{\rm in}(t,\mathbf{x})$ in a way that corresponds to realistic settings, such as no incoming radiation, i.e., $A^\mu_{\rm in}(t,\mathbf{x}) = 0$, or any other physically realistic choice, such as a given laser pulse arriving on a system of charges.

We therefore write the physically relevant solution for $A^\mu$ as

$$A^\mu(t,\mathbf{x}) = A^\mu_{\rm in}(t,\mathbf{x}) + \frac{\mu_0}{4\pi}\int d^3x'\, \frac{j^\mu(t-|\mathbf{x}-\mathbf{x}'|/c,\,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}, \tag{10.34}$$

where $A^\mu_{\rm in}(t,\mathbf{x})$ is a solution of the homogeneous equation describing the incoming field. If we are only interested in the field generated by the source, we can simply set $A^\mu_{\rm in}(t,\mathbf{x}) = 0$.$^{7}$

Observe that eq. (10.11) is explicitly Lorentz invariant, because both the $\Box$ operator and $\delta^{(4)}(x)$ are Lorentz invariant.$^{8}$ In the form (10.24) it is not evident how $G^+$ and $G^-$ behave under Lorentz transformations, but in fact they are invariant. This can be seen using the property (8.7) of the delta function, which implies that

$$\delta(x^2) = \delta\left[(x^0)^2 - |\mathbf{x}|^2\right] = \frac{1}{2|\mathbf{x}|}\left[\delta(x^0 - |\mathbf{x}|) + \delta(x^0 + |\mathbf{x}|)\right]. \tag{10.36}$$

The expression in the last line is not yet the combination that appears in the retarded or advanced Green’s functions. However, we can multiply this by a theta function, defined in eq. (1.67), to obtain$^{9}$

$$\theta(x^0)\,\delta(x^2) = \frac{1}{2|\mathbf{x}|}\,\delta(x^0 - |\mathbf{x}|), \tag{10.38}$$

$$\theta(-x^0)\,\delta(x^2) = \frac{1}{2|\mathbf{x}|}\,\delta(x^0 + |\mathbf{x}|). \tag{10.39}$$

In general, the sign of $x^0$ is not invariant under Lorentz transformations. However, if $x^0 > 0$ and $x^2 = 0$, the event $(x^0,\mathbf{x})$ is on the light cone of the event $(x^0 = 0, \mathbf{x} = 0)$. Then, $x^0$ will remain positive under any Lorentz boost (since the velocity $v_0$ of the boost is always restricted to be strictly smaller than $c$). This can be seen from the Lorentz transformation (7.22). Setting for definiteness $\mathbf{x} = (x,0,0)$, the condition $(x^0)^2 - x^2 = 0$ gives $x = \pm x^0$ and, under a Lorentz boost, $x^0 \to x'^0 = \gamma(x^0 + \beta x)$. Since $|x| = x^0$ and $|\beta| < 1$, $x'^0$ remains positive. We can also see it graphically from Fig. 7.1 on page 161, where the events in the future of the boosted observer are given by the part of the $(x^0, x)$ plane that lies above the line $t' = 0$, and contains all points with $x^0 > 0$ which lie on the light cone of the observer at the origin.
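The argument above can be illustrated numerically. The following is only a minimal sketch, assuming a boost along the $x$ axis of the form $x'^0 = \gamma(x^0 + \beta x)$ quoted in the text; the function names and the sampling of $\beta$ are illustrative choices, not part of the original derivation.

```python
import math

def boost_time_component(x0, x, beta):
    """Return x'^0 = gamma * (x^0 + beta * x) for a boost along the x axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (x0 + beta * x)

def future_light_cone_stays_future(x0=1.0, n_beta=199):
    """Scan boosts with |beta| < 1 and check that x'^0 > 0 for the two
    light-cone events x = +x0 and x = -x0 (with x0 > 0)."""
    for k in range(1, n_beta + 1):
        beta = -1.0 + 2.0 * k / (n_beta + 1)   # strictly inside (-1, 1)
        for x in (x0, -x0):
            if boost_time_component(x0, x, beta) <= 0.0:
                return False
    return True
```

Since $x'^0 = \gamma x^0 (1 \pm \beta)$ on the light cone, the scan can only fail if $|\beta| \geq 1$, which is excluded by construction.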

Therefore, the combinations $\theta(x^0)\delta(x^2)$ and, similarly, $\theta(-x^0)\delta(x^2)$, are explicitly Lorentz invariant. In terms of them, the advanced and retarded Green’s functions (10.23) can be written as

$$G^{\pm}(x) = -\frac{1}{2\pi}\,\theta(\pm x^0)\,\delta(x^2), \tag{10.40}$$

or, reinstating the second argument,

$$G^{\pm}(x;x') = -\frac{1}{2\pi}\,\theta\left[\pm(x^0 - x'^0)\right]\delta\left[(x - x')^2\right]. \tag{10.41}$$


7 As a byproduct of this analysis observe, from eq. (10.33), that

$$A^\mu_{\rm rad}(t,\mathbf{x}) = A^\mu_{\rm out}(t,\mathbf{x}) - A^\mu_{\rm in}(t,\mathbf{x}), \tag{10.35}$$

is the difference between the outgoing and the incoming field, and therefore can be interpreted as the radiation generated by the system. We see, from eq. (10.31), that it is obtained from the combination $G_{\rm rad} = G_{\rm ret} - G_{\rm adv}$. We will find this combination again in Section 12.3.5, when we discuss radiation reaction.

8 As we have already seen in eqs. (8.4) and (8.5), the invariance of $\delta^{(4)}(x)$ follows from the fact that $\int d^4x\, \delta^{(4)}(x) = 1$, together with the fact that, under a Lorentz transformation $x^\mu \to \Lambda^\mu{}_\nu x^\nu$, $d^4x \to (\det\Lambda)\, d^4x$ and $\det\Lambda = 1$, so $d^4x$ is Lorentz invariant.


9 We are using the fact that, for $|\mathbf{x}| \neq 0$, $\delta(x^0 - |\mathbf{x}|)$ has its support at $x^0$ strictly positive, where $\theta(x^0) = 1$, so

$$\theta(x^0)\,\delta(x^0 - |\mathbf{x}|) = \delta(x^0 - |\mathbf{x}|), \tag{10.37}$$

and similarly $\theta(x^0)\,\delta(x^0 + |\mathbf{x}|) = 0$. All these relations, however, become ill-defined at $|\mathbf{x}| = 0$ or at $x^0 = 0$, even in the sense of distributions, and can give ambiguous results. For instance, when $\mathbf{x} = 0$, multiplying both sides of eq. (10.37) by a regular function $f(x^0)$ and integrating, the left-hand side gives

$$\int_{-\infty}^{\infty} dx^0\, \theta(x^0)\,\delta(x^0)\, f(x^0) = \int_{0}^{\infty} dx^0\, \delta(x^0)\, f(x^0),$$

and this is an ill-defined integral, since $x^0 = 0$ is at the border of the integration domain, and the result depends on the sequence of functions chosen to define the Dirac delta. In contrast, on the right-hand side we get $\int_{-\infty}^{\infty} dx^0\, \delta(x^0) f(x^0)$, which is always well-defined and equal to $f(0)$. Therefore, care must be taken in some situations when using the explicitly covariant form of the Green’s functions. In Section 12.3.5 we will provide a careful treatment of these covariant expressions and of their derivatives.

10.2 The Liénard–Wiechert potentials


We now compute the fields generated by a point charge $q$ moving with arbitrary velocity. The corresponding charge and current densities are given by eqs. (8.1) and (8.2) or, in covariant notation, by eq. (8.3). We first perform the computation for the potential $A^0 = \phi/c$. Plugging eq. (8.1) into eq. (10.25) and using the sign corresponding to the retarded Green’s function, we get

$$\phi(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\int dt'\, d^3x'\, \frac{1}{|\mathbf{x}-\mathbf{x}'|}\, \delta^{(3)}[\mathbf{x}' - \mathbf{x}_0(t')]\; \delta\!\left(t' - t + \frac{|\mathbf{x}-\mathbf{x}'|}{c}\right). \tag{10.42}$$

We denote the trajectory of the particle by $\mathbf{x}_0(t)$, and its velocity by

$$\mathbf{v}(t) = \frac{d\mathbf{x}_0(t)}{dt}. \tag{10.43}$$

We are interested in the field generated by the charge itself, so we have set to zero the solution of the homogeneous equation. Rather than performing first the integral over $dt'$ to reach the form (10.26), for a point charge it is convenient to carry out first the integral over $d^3x'$ with the help of $\delta^{(3)}[\mathbf{x}' - \mathbf{x}_0(t')]$. This gives

$$\phi(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\int dt'\, \frac{1}{|\mathbf{x}-\mathbf{x}_0(t')|}\; \delta\!\left(t' - t + \frac{|\mathbf{x}-\mathbf{x}_0(t')|}{c}\right). \tag{10.44}$$

We now define the retarded time $t_{\rm ret}$ as the solution of the equation

$$t_{\rm ret} + \frac{|\mathbf{x} - \mathbf{x}_0(t_{\rm ret})|}{c} = t, \tag{10.45}$$

so the argument of the Dirac delta in eq. (10.44) vanishes for $t' = t_{\rm ret}$. Note that eq. (10.45) is an implicit definition of $t_{\rm ret}$ as a function of $t$ and $\mathbf{x}$,

$$t_{\rm ret} = t_{\rm ret}(t,\mathbf{x}). \tag{10.46}$$

Retarded time has a clear physical meaning. If we imagine that the charge continuously emits light signals toward an observer in $\mathbf{x}$, then the signal that reaches the observer at time $t$ was emitted by the charge at an earlier time $t_{\rm ret}$, when the charge was at the position $\mathbf{x}_0(t_{\rm ret})$, such that the observation time $t$ is equal to the emission time $t_{\rm ret}$ plus the time $|\mathbf{x}-\mathbf{x}_0(t_{\rm ret})|/c$ taken by the light signal to reach $\mathbf{x}$ from the position $\mathbf{x}_0(t_{\rm ret})$. Also note that, for the observer in $\mathbf{x}$, $\mathbf{x}_0(t_{\rm ret})$ is the apparent position of the charge at time $t$. It is useful to define

$$\mathbf{R}(t,\mathbf{x}) = \mathbf{x} - \mathbf{x}_0(t), \tag{10.47}$$


and $R = |\mathbf{R}|$, so that eq. (10.45) reads

$$t_{\rm ret} + \frac{1}{c}\, R(t_{\rm ret},\mathbf{x}) = t. \tag{10.48}$$

To compute the integral over $t'$ in eq. (10.44) we define

$$f(t') = t' - t + \frac{|\mathbf{x}-\mathbf{x}_0(t')|}{c} = t' - t + \frac{R(t')}{c} \tag{10.49}$$

[where, for notational simplicity, in the intermediate steps we omit the argument $\mathbf{x}$ in $R(t',\mathbf{x})$], and we observe that the equation $f(t') = 0$ has just a single solution at $t' = t_{\rm ret}$.$^{10}$ Therefore

$$\delta[f(t')] = \frac{1}{\left|df/dt'\right|_{t'=t_{\rm ret}}}\, \delta(t' - t_{\rm ret}). \tag{10.50}$$

From eq. (10.49),

$$\frac{df}{dt'} = 1 + \frac{1}{c}\frac{dR(t')}{dt'}. \tag{10.51}$$

We now use$^{11}$

$$\frac{dR(t')}{dt'} = -\frac{\mathbf{v}(t')\cdot\mathbf{R}(t')}{R(t')}. \tag{10.52}$$

Then eq. (10.44) gives (reinstating the argument $\mathbf{x}$ in $R$)

$$\phi(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\int dt'\, \frac{1}{R(t',\mathbf{x})}\, \frac{1}{1 - \mathbf{v}(t')\cdot\hat{\mathbf{R}}(t',\mathbf{x})/c}\; \delta(t' - t_{\rm ret}) = \frac{q}{4\pi\epsilon_0} \left[\frac{1}{R(t',\mathbf{x}) - \mathbf{v}(t')\cdot\mathbf{R}(t',\mathbf{x})/c}\right]_{t'=t_{\rm ret}}. \tag{10.53}$$

From eqs. (8.2) and (10.25), exactly the same computation gives $\mathbf{A}$, with $q\mathbf{v}(t_{\rm ret})$ instead of $q$ in the numerator, and $\mu_0$ instead of $1/\epsilon_0$ in front. In conclusion, the potentials generated by a charge on an arbitrary trajectory $\mathbf{x}_0(t)$, with velocity $\mathbf{v}(t) = d\mathbf{x}_0/dt$, are given by

$$\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\left[\frac{q}{R - \mathbf{v}\cdot\mathbf{R}/c}\right]_{\rm ret} \tag{10.54}$$

and

$$\mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\left[\frac{q\mathbf{v}}{R - \mathbf{v}\cdot\mathbf{R}/c}\right]_{\rm ret}, \tag{10.55}$$

where the subscript “ret” in these expressions indicates that, to get the potentials at time $t$ and position $\mathbf{x}$, the right-hand sides must be evaluated at the retarded time $t_{\rm ret}(t,\mathbf{x})$, defined by eq. (10.45), i.e., in eqs. (10.54) and (10.55), $\mathbf{R} = \mathbf{R}(t_{\rm ret}(t,\mathbf{x}),\mathbf{x})$. These are the Liénard–Wiechert potentials.$^{12}$ Also note that the velocity that enters in eq. (10.53) is

$$\mathbf{v}(t')\Big|_{t'=t_{\rm ret}(t,\mathbf{x})} = \frac{d\mathbf{x}_0(t')}{dt'}\bigg|_{t'=t_{\rm ret}(t,\mathbf{x})} = \frac{d\mathbf{x}_0(t_{\rm ret})}{dt_{\rm ret}}, \tag{10.56}$$

i.e., it is the velocity computed with respect to the natural time variable, $t_{\rm ret}$, of an observer located at the position of the charge, rather than with respect to the time $t$ of the distant observer that measures the field produced.

10 The fact that $t' = t_{\rm ret}$ is a solution of $f(t') = 0$ follows from the definition (10.45) of $t_{\rm ret}$. This solution is unique because, for a physically acceptable trajectory $\mathbf{x}_0(t)$, such that $|d\mathbf{x}_0/dt| < c$, at fixed $\mathbf{x}$ the retarded time $t_{\rm ret}(t,\mathbf{x})$ is a monotonically increasing function of $t$.

11 The explicit computation goes as follows:

$$\frac{dR(t')}{dt'} = \frac{1}{2R(t')}\frac{d}{dt'}R^2(t') = \frac{1}{2R(t')}\frac{d}{dt'}\left[|\mathbf{x}|^2 + |\mathbf{x}_0(t')|^2 - 2\mathbf{x}\cdot\mathbf{x}_0(t')\right] = \frac{1}{2R(t')}\left[2\mathbf{x}_0(t')\cdot\mathbf{v}(t') - 2\mathbf{x}\cdot\mathbf{v}(t')\right] = -\frac{1}{R(t')}\,\mathbf{v}(t')\cdot\mathbf{R}(t').$$

12 Found by Alfred-Marie Liénard in 1898 and, with an independent method, by Emil Wiechert in 1900.
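For a generic trajectory, eq. (10.45) has to be solved numerically. The following is only a minimal sketch of how this could be done, assuming units with $c = 1$ and $\epsilon_0 = 1$ by default; the function names, the bisection strategy, and the callables `trajectory` and `velocity` are illustrative assumptions and not part of the original text. It uses the fact that $f(t')$ in eq. (10.49) is monotonically increasing for $|\mathbf{v}| < c$.

```python
import math

def retarded_time(t, x, trajectory, c=1.0, tol=1e-12, max_iter=200):
    """Solve t_ret + |x - x0(t_ret)|/c = t, eq. (10.45), by bisection.
    `trajectory(t')` must return the position x0(t') as a 3-tuple, with |v| < c."""
    def f(tp):
        return tp - t + math.dist(x, trajectory(tp)) / c
    hi = t                                   # f(t) = R(t)/c >= 0
    step = math.dist(x, trajectory(t)) / c + 1.0
    lo = t - step
    while f(lo) > 0.0:                       # expand until the root is bracketed
        step *= 2.0
        lo = t - step
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def light_cone_residual(t, x, trajectory, t_ret, c=1.0):
    """Left-hand side minus right-hand side of eq. (10.45)."""
    return t_ret + math.dist(x, trajectory(t_ret)) / c - t

def lw_scalar_potential(t, x, trajectory, velocity, q=1.0, c=1.0, eps0=1.0):
    """Scalar Lienard-Wiechert potential, eq. (10.54), evaluated at t_ret."""
    t_ret = retarded_time(t, x, trajectory, c)
    x0 = trajectory(t_ret)
    v = velocity(t_ret)
    R_vec = [xi - x0i for xi, x0i in zip(x, x0)]
    R = math.sqrt(sum(r * r for r in R_vec))
    v_dot_R = sum(vi * ri for vi, ri in zip(v, R_vec))
    return q / (4.0 * math.pi * eps0 * (R - v_dot_R / c))
```

For a charge at rest the routine reduces to the Coulomb potential evaluated at distance $R$, which provides a simple sanity check.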

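It can be useful to define also the quantity $\mathbf{R}_a(t,\mathbf{x})$ from

$$\mathbf{R}_a(t,\mathbf{x}) = \mathbf{x} - \mathbf{x}_0[t_{\rm ret}(t,\mathbf{x})], \tag{10.57}$$

so that, comparing with eq. (10.47),

$$\mathbf{R}_a(t,\mathbf{x}) = \mathbf{R}(t_{\rm ret}(t,\mathbf{x}),\mathbf{x}). \tag{10.58}$$

Note that $-\mathbf{R}_a = \mathbf{x}_0(t_{\rm ret}) - \mathbf{x}$ is the apparent position of the source, with respect to an observer in $\mathbf{x}$ (the subscript “a” in $\mathbf{R}_a$ indeed stands for “apparent”). It depends on $\mathbf{x}$ both explicitly, and through the dependence of $t_{\rm ret}$ on $\mathbf{x}$. We also define the retarded velocity $\mathbf{v}_r(t,\mathbf{x})$ as

$$\mathbf{v}_r(t,\mathbf{x}) = \frac{d\mathbf{x}_0(t')}{dt'}\bigg|_{t'=t_{\rm ret}(t,\mathbf{x})} = \frac{d\mathbf{x}_0(t_{\rm ret})}{dt_{\rm ret}}. \tag{10.59}$$

Therefore, eqs. (10.54) and (10.55) can be written as

$$\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{q}{R_a(t,\mathbf{x}) - \mathbf{v}_r(t,\mathbf{x})\cdot\mathbf{R}_a(t,\mathbf{x})/c} \tag{10.60}$$

and

$$\mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\, \frac{q\,\mathbf{v}_r(t,\mathbf{x})}{R_a(t,\mathbf{x}) - \mathbf{v}_r(t,\mathbf{x})\cdot\mathbf{R}_a(t,\mathbf{x})/c}. \tag{10.61}$$

Observe that, in the Liénard–Wiechert potentials, the gauge potentials are related by

$$\mathbf{A}(t,\mathbf{x}) = \frac{\mathbf{v}_r(t,\mathbf{x})}{c^2}\, \phi(t,\mathbf{x}). \tag{10.62}$$

It can be useful to write $\mathbf{R}_a$ in terms of its modulus $R_a$ and the unit vector $\hat{\mathbf{R}}_a = \mathbf{R}_a/R_a$. Then, in particular, eq. (10.60) reads

$$\phi(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\, \frac{1}{R_a(t,\mathbf{x})} \left[\frac{1}{1 - \mathbf{v}_r(t,\mathbf{x})\cdot\hat{\mathbf{R}}_a(t,\mathbf{x})/c}\right]. \tag{10.63}$$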


If the term in brackets were not there, this would just be a “retarded Coulomb potential,” in which the instantaneous position of the source is replaced by its retarded position. The emergence of the extra velocity-dependent term in brackets, which came out of the computation, can also be understood by considering the simpler case of a particle moving with constant velocity, and performing a Lorentz boost of the Coulomb potential, as we discuss in Section 10.3.

10.3 Fields of charge in uniform motion


In the simple case of a charge moving with constant velocity, the general formalism based on the Liénard–Wiechert potentials is not really necessary. To compute the field generated by this source it is much simpler to observe that the result can be obtained by performing a Lorentz boost from the inertial frame where the charge is at rest to the inertial frame where it has velocity $\mathbf{v} = v\hat{\mathbf{x}}$. Consider first a frame $K_s$ (the “source” frame) where the charge is at rest at the origin, and denote the coordinates in this frame by $(t_s,\mathbf{x}_s)$. The potentials $\phi_s$ and $\mathbf{A}_s$ in this frame are given by

$$\phi_s(t_s,\mathbf{x}_s) = \frac{1}{4\pi\epsilon_0}\, \frac{q}{\sqrt{x_s^2 + y_s^2 + z_s^2}}, \tag{10.64}$$

$$\mathbf{A}_s(t_s,\mathbf{x}_s) = 0. \tag{10.65}$$

Let $K$ be a frame, with coordinates $(t,\mathbf{x})$, such that, at $t = 0$, the charge is at the origin with velocity $\mathbf{v} = v\hat{\mathbf{x}}$. The relation between the coordinates of the two frames is given by eqs. (7.24) and (7.25), which we rewrite here in the present notation,

$$t = \gamma\left(t_s + \frac{v}{c^2}\, x_s\right), \tag{10.66}$$

$$x = \gamma\,(x_s + v t_s). \tag{10.67}$$

These relations can be inverted to give

$$t_s = \gamma\left(t - \frac{v}{c^2}\, x\right), \tag{10.68}$$

$$x_s = \gamma\,(x - v t), \tag{10.69}$$

while $y = y_s$ and $z = z_s$. So, in particular,

$$x_s^2 + y_s^2 + z_s^2 = \gamma^2 (x - vt)^2 + y^2 + z^2. \tag{10.70}$$


Similarly, since $A^0 = \phi/c$ and $\mathbf{A}$ are the components of a four-vector, the potentials transform as

$$\phi = \gamma\left[\phi_s + v\,(A_x)_s\right], \tag{10.71}$$

$$A_x = \gamma\left[(A_x)_s + \frac{v}{c^2}\,\phi_s\right], \tag{10.72}$$

while $A_y = (A_y)_s$ and $A_z = (A_z)_s$. We use the label “s” for the source frame, and we reserve $\phi$ and $\mathbf{A}$ for the fields in the observer frame. Note that these are the transformations of the potentials at the same space-time point $P$, whose coordinates will have different numerical values in different frames. Since $\mathbf{A}_s = 0$, for $\phi(t,\mathbf{x})$ we get

$$\phi(t,\mathbf{x}) = \gamma\,\phi_s(t_s,\mathbf{x}_s) = \frac{1}{4\pi\epsilon_0}\, \frac{q\gamma}{\sqrt{x_s^2 + y_s^2 + z_s^2}}. \tag{10.73}$$

13 The plus sign in front of the square root is fixed by the fact that, for $v = 0$, this expression reduces to $t - t_{\rm ret} = r/c$.

14 Explicitly,

$$\frac{\mathbf{v}}{c}\cdot\mathbf{R}(t_{\rm ret}) = \frac{v}{c}\,\hat{\mathbf{x}}\cdot\mathbf{R}(t_{\rm ret}) = \frac{v}{c}\,(x - v t_{\rm ret}) = \frac{v}{c}\left[(x - vt) + v(t - t_{\rm ret})\right] = \frac{v}{c}\,(x - vt) + \frac{v^2}{c^2}\, R(t_{\rm ret}).$$


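We now make use of the fact that $(t,\mathbf{x})$ and $(t_s,\mathbf{x}_s)$ refer to the same space-time point seen in two different Lorentz frames, so they are related by eqs. (10.68) and (10.69), and therefore by eq. (10.70). We then obtain

$$\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{\gamma q}{\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}}. \tag{10.74}$$

Similarly,

$$A_x(t,\mathbf{x}) = \gamma\,\frac{v}{c^2}\,\phi_s = \frac{1}{4\pi\epsilon_0 c^2}\, \frac{q\gamma v}{\sqrt{x_s^2 + y_s^2 + z_s^2}} = \frac{\mu_0}{4\pi}\, \frac{q\gamma v}{\sqrt{x_s^2 + y_s^2 + z_s^2}}, \tag{10.75}$$

while $A_y = A_z = 0$. Since $\mathbf{v} = v\hat{\mathbf{x}}$, in vector form we have

$$\mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\, \frac{q\gamma\,\mathbf{v}}{\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}}. \tag{10.76}$$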

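It is instructive to rederive these results using the Liénard–Wiechert potentials. We therefore set $\mathbf{x}_0(t) = \mathbf{v}t$, with $\mathbf{v} = v\hat{\mathbf{x}}$, in eqs. (10.54) and (10.55). For a generic trajectory, the main technical difficulty, when applying the general formalism based on the Liénard–Wiechert potentials, is to solve eq. (10.45) to get $t_{\rm ret}$ as a function of $t$ and $\mathbf{x}$. However, for uniform motion, this can be done analytically. Equation (10.45) in this case gives

$$c^2 (t - t_{\rm ret})^2 = |\mathbf{x} - v t_{\rm ret}\hat{\mathbf{x}}|^2 = (x - v t_{\rm ret})^2 + y^2 + z^2. \tag{10.77}$$

This is a second-order equation in $t_{\rm ret}$, or equivalently in $(t - t_{\rm ret})$, whose solution can be written as$^{13}$

$$c\,(t - t_{\rm ret}) = \gamma^2\,\frac{v}{c}\,(x - vt) + \gamma\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}. \tag{10.78}$$

Note that, from eq. (10.48), $R(t_{\rm ret}) = c\,(t - t_{\rm ret})$ and therefore, for uniform motion,

$$R(t_{\rm ret}) = \gamma^2\,\frac{v}{c}\,(x - vt) + \gamma\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}, \tag{10.79}$$

while$^{14}$

$$\frac{\mathbf{v}}{c}\cdot\mathbf{R}(t_{\rm ret}) = \frac{v}{c}\,(x - vt) + \frac{v^2}{c^2}\, R(t_{\rm ret}). \tag{10.80}$$

Therefore

$$R(t_{\rm ret}) - \frac{\mathbf{v}}{c}\cdot\mathbf{R}(t_{\rm ret}) = \left(1 - \frac{v^2}{c^2}\right) R(t_{\rm ret}) - \frac{v}{c}\,(x - vt) = \frac{1}{\gamma}\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}, \tag{10.81}$$

where, in the second line, we used eq. (10.79). Then, eq. (10.54) gives

$$\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{\gamma q}{\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}}, \tag{10.82}$$

and eq. (10.55) gives

$$\mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\, \frac{q\gamma\,\mathbf{v}}{\sqrt{\gamma^2 (x - vt)^2 + y^2 + z^2}}. \tag{10.83}$$

We have therefore recovered eqs. (10.74) and (10.76).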
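The agreement between the two derivations can also be checked numerically. The sketch below is only illustrative and assumes units with $c = 1$, $q = 1$, and $\epsilon_0 = 1$ by default; the function name and its return convention are hypothetical. It evaluates the closed-form retarded distance of eq. (10.79), verifies that it satisfies the light-cone condition (10.77), and compares the Liénard–Wiechert form (10.54) with the boosted Coulomb form (10.74).

```python
import math

def uniform_motion_check(t, x, y, z, v, c=1.0, q=1.0, eps0=1.0):
    """Return (residual of eq. (10.77), phi from eq. (10.54), phi from eq. (10.74))
    for a charge moving as x0(t) = (v t, 0, 0)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    X = x - v * t
    root = math.sqrt(gamma ** 2 * X ** 2 + y ** 2 + z ** 2)
    # eq. (10.79): R(t_ret) = c (t - t_ret)
    R_ret = gamma ** 2 * (v / c) * X + gamma * root
    t_ret = t - R_ret / c
    # light-cone condition (10.77): should vanish up to rounding
    residual = R_ret ** 2 - ((x - v * t_ret) ** 2 + y ** 2 + z ** 2)
    # Lienard-Wiechert form (10.54), with v.R = v (x - v t_ret)
    phi_lw = q / (4.0 * math.pi * eps0 * (R_ret - v * (x - v * t_ret) / c))
    # boosted Coulomb form (10.74)
    phi_boost = gamma * q / (4.0 * math.pi * eps0 * root)
    return residual, phi_lw, phi_boost
```

A call such as `uniform_motion_check(0.3, 1.0, 2.0, -0.5, 0.6)` should return a residual compatible with zero and two potentials that agree to rounding accuracy.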
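We can now compute the corresponding expressions for the electric and magnetic fields. Using eq. (3.83), we get

$$\mathbf{E}(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\, \gamma\, \frac{\mathbf{x} - \mathbf{x}_0(t)}{\left[\gamma^2 (x - vt)^2 + y^2 + z^2\right]^{3/2}}, \tag{10.84}$$

where $\mathbf{x}_0(t) = (vt, 0, 0)$ is the position of the charge at time $t$. We can rewrite the result using $\mathbf{R}(t,\mathbf{x}) = \mathbf{x} - \mathbf{x}_0(t)$, see eq. (10.47), which, for the case of uniform motion, becomes $\mathbf{R}(t,\mathbf{x}) = \mathbf{x} - \mathbf{v}t$. We also define $\theta$ as the angle between $\mathbf{R}$ and $\mathbf{v}$, so that

$$x - x_0(t) = R(t,\mathbf{x})\cos\theta, \tag{10.85}$$

and

$$y^2 + z^2 = R^2(t,\mathbf{x})\sin^2\theta. \tag{10.86}$$

Then eq. (10.84) can be rewritten as

$$\mathbf{E}(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\, \frac{\gamma}{\left[1 + (\gamma^2 - 1)\cos^2\theta\right]^{3/2}}\, \frac{\hat{\mathbf{R}}(t,\mathbf{x})}{R^2(t,\mathbf{x})}, \tag{10.87}$$

or, equivalently, writing $\cos^2\theta = 1 - \sin^2\theta$ and using the explicit expression of $\gamma$ in terms of $v$, as

$$\mathbf{E}(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\, \frac{1 - v^2/c^2}{\left[1 - (v^2/c^2)\sin^2\theta\right]^{3/2}}\, \frac{\hat{\mathbf{R}}(t,\mathbf{x})}{R^2(t,\mathbf{x})}. \tag{10.88}$$

Similarly, using eq. (3.80), we get

$$\mathbf{B}(t,\mathbf{x}) = \frac{1}{c^2}\, \mathbf{v}\times\mathbf{E}(t,\mathbf{x}). \tag{10.89}$$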


Several aspects of this result are noteworthy:

(1) The electric field is directed radially with respect to the instantaneous position of the charge, i.e., it is in the direction of $\mathbf{R} = \mathbf{x} - \mathbf{x}_0(t)$. The fact that the field at time $t$ depends on the instantaneous position $\mathbf{x}_0(t)$ of the charge at the same time $t$, rather than on the position at retarded time, is not a sign of action at a distance. Simply, having specified that the motion is at constant velocity for all times, the future position of the charge can be perfectly predicted or, in other words, the position of the charge at retarded time $t_{\rm ret}$ determines the position of the charge at the subsequent time $t$.

(2) The modulus of the electric field is not spherically symmetric: in particular, if in eq. (10.87) we set $\theta = 0$, i.e., $y = z = 0$ and $x - x_0(t) = R$, the modulus of the electric field becomes

$$|\mathbf{E}| = \frac{q}{4\pi\epsilon_0}\, \frac{1}{\gamma^2 R^2} \qquad (\theta = 0), \tag{10.90}$$

so the field in the direction of motion is reduced by a factor $1/\gamma^2$ compared to the Coulomb field of a charge at rest. On the other hand, if we set $\theta = \pi/2$, i.e., $y^2 + z^2 = R^2$ and $x - x_0(t) = 0$, we get

$$|\mathbf{E}| = \frac{q}{4\pi\epsilon_0}\, \frac{\gamma}{R^2} \qquad (\theta = \pi/2), \tag{10.91}$$

so the field at a right angle with respect to the direction of motion is enhanced by a factor $\gamma$ compared to the Coulomb field.


(3) The lines of $\mathbf{B}$ circulate around the direction of motion, as for a steady current. The modulus of $\mathbf{B}$ is stronger at a right angle to the motion and is reduced as we approach the direction of motion, both because of the behavior of $\mathbf{E}$, and because of the $\sin\theta$ factor coming from the vector product $\mathbf{v}\times\mathbf{E}$,

$$B = \frac{1}{c^2}\, v E \sin\theta = \frac{\mu_0}{4\pi}\, \frac{\gamma q v}{R^2}\, \frac{\sin\theta}{\left[1 + (\gamma^2 - 1)\cos^2\theta\right]^{3/2}}, \tag{10.92}$$

or, equivalently,

$$B = \frac{\mu_0}{4\pi}\, \frac{q v}{R^2}\, \frac{(1 - v^2/c^2)\sin\theta}{\left[1 - (v^2/c^2)\sin^2\theta\right]^{3/2}}. \tag{10.93}$$

In particular, $\mathbf{B}$ vanishes at $\theta = 0$, i.e., along the direction of motion.

(4) In the limit of an ultra-relativistic charge, $\gamma \gg 1$, the electric and magnetic fields are concentrated within a small angle $\delta\theta$ around $\theta = \pi/2$. Indeed, the limit $\gamma \gg 1$ has two regimes. If $\theta$ is fixed and different from $\pm\pi/2$, so that $\cos\theta \neq 0$, eventually in the limit $\gamma\to\infty$ also $\gamma^2\cos^2\theta \gg 1$, and

$$\frac{\gamma}{\left[1 + (\gamma^2 - 1)\cos^2\theta\right]^{3/2}} = O\!\left(\frac{1}{\gamma^2}\right). \tag{10.94}$$

Correspondingly, for the modulus of the electric field we have

$$|\mathbf{E}| = \frac{1}{4\pi\epsilon_0}\, \frac{q}{R^2} \times O\!\left(\frac{1}{\gamma^2}\right), \tag{10.95}$$

which is smaller than the Coulomb field of a non-relativistic particle by a factor $\gamma^2$. Consider, however, the situation in which we send at the same time $\gamma\to\infty$ and $\theta\to\pi/2$. Writing $\theta = (\pi/2) + \delta\theta$, to first order we have $\cos\theta \simeq -\delta\theta$, and, for $\gamma\to\infty$ and $\delta\theta\to 0$,

$$\frac{\gamma}{\left[1 + (\gamma^2 - 1)\cos^2\theta\right]^{3/2}} \simeq \frac{\gamma}{\left[1 + (\gamma\,\delta\theta)^2\right]^{3/2}}. \tag{10.96}$$

If the limit $\gamma\to\infty$ and $\delta\theta\to 0$ is taken so that $\gamma\,\delta\theta$ stays finite, say $\gamma\,\delta\theta \lesssim O(1)$, then, rather than reproducing the behavior (10.94), this expression grows as $\gamma$, and

$$|\mathbf{E}| = \frac{1}{4\pi\epsilon_0}\, \frac{q}{R^2} \times O(\gamma). \tag{10.97}$$

This is larger by a factor $O(\gamma)$ compared to the Coulomb field of a non-relativistic particle. Therefore, there are two regimes, separated by the region $\gamma\,\delta\theta = O(1)$. For $\gamma\,\delta\theta \gg 1$ we are in the regime (10.95) and $E$ is very small (compared to the Coulomb field of a static charge), while, for $\gamma\,\delta\theta \lesssim O(1)$, $E$ is very large. The electric field of an ultra-relativistic particle is therefore focused in a small region, with opening angle $\delta\theta \sim 1/\gamma$, around the plane transverse to the velocity of the charge.
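The angular dependence discussed in points (2) and (4) can be made concrete with a short numerical sketch. The function below simply evaluates the factor multiplying the Coulomb field $q/(4\pi\epsilon_0 R^2)$ in eq. (10.87); its name and the sample values of $\gamma$ are illustrative assumptions.

```python
import math

def angular_factor(gamma, theta):
    """Ratio of |E| in eq. (10.87) to the Coulomb value q / (4 pi eps0 R^2)."""
    return gamma / (1.0 + (gamma ** 2 - 1.0) * math.cos(theta) ** 2) ** 1.5

def sample_profile(gamma, n_points=5):
    """Evaluate the factor at angles pi/2 + k/gamma, k = 0..n_points-1,
    i.e. in steps of the characteristic opening angle 1/gamma."""
    return [angular_factor(gamma, 0.5 * math.pi + k / gamma) for k in range(n_points)]
```

For instance, with $\gamma = 10$ the factor equals $1/\gamma^2$ at $\theta = 0$ and $\gamma$ at $\theta = \pi/2$, and `sample_profile(10.0)` shows how quickly the enhancement is lost once $\gamma\,\delta\theta$ exceeds a few units.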


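10.4 Radiation field from accelerated charges

The fields generated by a charge in arbitrary motion can now in principle be obtained by inserting eqs. (10.60) and (10.61) into the expression of $\mathbf{E}$ and $\mathbf{B}$ in terms of the gauge potentials, eqs. (3.83) and (3.80). The computation, however, still involves some subtleties; the main point, when taking the derivatives with respect to $t$ and $\mathbf{x}$ of the gauge potentials, is to correctly account for the dependence on $t$ and on $\mathbf{x}$ that enters implicitly through $t_{\rm ret}(t,\mathbf{x})$. To this purpose, we first compute $\partial t_{\rm ret}/\partial t$ (at constant $\mathbf{x}$) by differentiating eq. (10.48). We use $R_a(t,\mathbf{x})$ defined in eq. (10.57), so that $R(t_{\rm ret},\mathbf{x}) = R_a(t,\mathbf{x})$. Then

$$\frac{\partial t_{\rm ret}}{\partial t} + \frac{1}{c}\frac{\partial}{\partial t} R_a(t,\mathbf{x}) = 1. \tag{10.98}$$

We next observe that$^{15}$

$$\frac{\partial}{\partial t} R_a(t,\mathbf{x}) = -\hat{\mathbf{R}}_a(t,\mathbf{x})\cdot\mathbf{v}_r(t)\, \frac{\partial t_{\rm ret}}{\partial t}. \tag{10.99}$$

Inserting this into eq. (10.98) and solving for $\partial t_{\rm ret}/\partial t$ we get

$$\frac{\partial t_{\rm ret}}{\partial t} = \frac{1}{1 - \hat{\mathbf{R}}_a(t,\mathbf{x})\cdot\mathbf{v}_r(t)/c}. \tag{10.100}$$

We proceed in the same way to compute $\partial_i t_{\rm ret}(t,\mathbf{x})$. Applying $\partial_i$ to eq. (10.48), at constant $t$, we get$^{16}$

$$-c\,\partial_i t_{\rm ret} = (\hat{\mathbf{R}}_a)_i - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r(t)\, \partial_i t_{\rm ret}, \tag{10.101}$$

and therefore

$$\nabla t_{\rm ret} = -\frac{1}{c}\, \frac{\hat{\mathbf{R}}_a}{1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r(t)/c}. \tag{10.102}$$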
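Equation (10.100) can be tested numerically against a finite-difference derivative of the retarded time. The sketch below is only an illustration under stated assumptions: units with $c = 1$ by default, a hypothetical circular test trajectory with speed $0.5\,c$, and a fixed-point iteration for $t_{\rm ret}$ (which is a contraction as long as $|\mathbf{v}| < c$). None of these names come from the original text.

```python
import math

def circular_trajectory(s, radius=1.0, omega=0.5):
    """Hypothetical test trajectory: circle in the xy plane, speed radius*omega."""
    return (radius * math.cos(omega * s), radius * math.sin(omega * s), 0.0)

def circular_velocity(s, radius=1.0, omega=0.5):
    """Velocity dx0/ds of the circular test trajectory."""
    return (-radius * omega * math.sin(omega * s),
            radius * omega * math.cos(omega * s), 0.0)

def retarded_time(t, x, traj, c=1.0, n_iter=200):
    """Fixed-point iteration t_ret <- t - |x - x0(t_ret)|/c, from eq. (10.45)."""
    t_ret = t
    for _ in range(n_iter):
        t_ret = t - math.dist(x, traj(t_ret)) / c
    return t_ret

def dtret_dt(t, x, traj, vel, c=1.0, h=1e-5):
    """Return (analytic, numeric) values of d t_ret / d t at fixed x."""
    t_ret = retarded_time(t, x, traj, c)
    x0, v = traj(t_ret), vel(t_ret)
    R_vec = [a - b for a, b in zip(x, x0)]
    R = math.sqrt(sum(r * r for r in R_vec))
    n_dot_v = sum(r * vi for r, vi in zip(R_vec, v)) / R
    analytic = 1.0 / (1.0 - n_dot_v / c)                       # eq. (10.100)
    numeric = (retarded_time(t + h, x, traj, c)
               - retarded_time(t - h, x, traj, c)) / (2.0 * h)  # central difference
    return analytic, numeric
```

The two numbers returned by `dtret_dt(3.0, (5.0, 2.0, 1.0), circular_trajectory, circular_velocity)` should agree up to the truncation error of the central difference.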


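15 Explicitly,

$$\frac{\partial}{\partial t} R_a(t,\mathbf{x}) = \frac{1}{2R_a(t,\mathbf{x})}\frac{\partial}{\partial t}\left|\mathbf{x} - \mathbf{x}_0(t_{\rm ret})\right|^2 = \frac{1}{2R_a(t,\mathbf{x})}\frac{\partial}{\partial t}\left[\mathbf{x}^2 + \mathbf{x}_0^2(t_{\rm ret}) - 2\mathbf{x}\cdot\mathbf{x}_0(t_{\rm ret})\right]$$

$$= -\frac{\mathbf{x} - \mathbf{x}_0(t_{\rm ret})}{R_a(t,\mathbf{x})}\cdot\frac{d\mathbf{x}_0(t_{\rm ret})}{dt_{\rm ret}}\, \frac{\partial t_{\rm ret}}{\partial t} = -\hat{\mathbf{R}}_a(t,\mathbf{x})\cdot\frac{d\mathbf{x}_0(t')}{dt'}\bigg|_{t'=t_{\rm ret}} \frac{\partial t_{\rm ret}}{\partial t} = -\hat{\mathbf{R}}_a(t,\mathbf{x})\cdot\mathbf{v}_r(t)\, \frac{\partial t_{\rm ret}}{\partial t},$$

where, in the last line, we used the definition (10.59) of $\mathbf{v}_r(t)$.

16 This is obtained similarly, writing

$$-c\,\partial_i t_{\rm ret} = \partial_i R_a = \frac{1}{2R_a(t,\mathbf{x})}\, \partial_i \left|\mathbf{x} - \mathbf{x}_0[t_{\rm ret}(t,\mathbf{x})]\right|^2 = \frac{1}{2R_a(t,\mathbf{x})}\, \partial_i \left\{\mathbf{x}^2 + \mathbf{x}_0^2[t_{\rm ret}(t,\mathbf{x})] - 2\mathbf{x}\cdot\mathbf{x}_0[t_{\rm ret}(t,\mathbf{x})]\right\}$$

$$= \frac{1}{2R_a(t,\mathbf{x})}\left\{2x_i + \frac{d\mathbf{x}_0^2(t_{\rm ret})}{dt_{\rm ret}}\, \partial_i t_{\rm ret} - 2[\mathbf{x}_0(t_{\rm ret})]_i - 2\mathbf{x}\cdot\frac{d\mathbf{x}_0(t_{\rm ret})}{dt_{\rm ret}}\, \partial_i t_{\rm ret}\right\}$$

$$= (\hat{\mathbf{R}}_a)_i + \frac{[\mathbf{x}_0(t_{\rm ret}) - \mathbf{x}]\cdot\mathbf{v}_r(t)}{R_a(t,\mathbf{x})}\, \partial_i t_{\rm ret} = (\hat{\mathbf{R}}_a)_i - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r(t)\, \partial_i t_{\rm ret}.$$

Using eqs. (10.100) and (10.102), one can explicitly compute all spatial and temporal derivatives of the quantities $\mathbf{R}_a(t,\mathbf{x})$ and $\mathbf{v}_r(t,\mathbf{x})$ that appear in eqs. (10.60) and (10.61) and obtain $\mathbf{E}$ and $\mathbf{B}$ from eqs. (3.80) and (3.83). The rest of the computation is long, but in principle straightforward. The result for the electric field can be written as

$$\mathbf{E}(t,\mathbf{x}) = \mathbf{E}_v(t,\mathbf{x}) + \mathbf{E}_{\rm rad}(t,\mathbf{x}). \tag{10.103}$$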


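The term $\mathbf{E}_v$ depends on the retarded position and velocity of the charge (the subscript $v$ indeed stands for “velocity”), and is given by

$$\mathbf{E}_v(t,\mathbf{x}) = \frac{q}{4\pi\epsilon_0}\, \frac{\hat{\mathbf{R}}_a - \mathbf{v}_r/c}{\gamma^2 R_a^2\, (1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^3}, \tag{10.104}$$

where, for notational simplicity, here and in the following equations, we do not explicitly write that $R_a$ and $\hat{\mathbf{R}}_a$ are actually functions of $(t,\mathbf{x})$, and that $\mathbf{v}_r$ is a function of $t$.

The term $\mathbf{E}_{\rm rad}$, in contrast, depends on the retarded position, velocity, and acceleration of the charge, and is given by

$$\mathbf{E}_{\rm rad}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\, \frac{q}{c^2 R_a}\, \frac{[\dot{\mathbf{v}}_r \times (\hat{\mathbf{R}}_a - \mathbf{v}_r/c)] \times \hat{\mathbf{R}}_a}{(1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^3}, \tag{10.105}$$

where

$$\dot{\mathbf{v}}_r = \frac{d\mathbf{v}(t')}{dt'}\bigg|_{t'=t_{\rm ret}} \tag{10.106}$$

$$\phantom{\dot{\mathbf{v}}_r} = \frac{d^2\mathbf{x}_0(t_{\rm ret})}{dt_{\rm ret}^2}, \tag{10.107}$$

and, in the second line, we used eq. (10.59). The subscript “rad” in $\mathbf{E}_{\rm rad}$ stands for “radiation,” for reasons that we will discuss below. The result for the magnetic field can be written as

$$\mathbf{B}(t,\mathbf{x}) = \frac{1}{c}\, \hat{\mathbf{R}}_a \times \mathbf{E}(t,\mathbf{x}). \tag{10.108}$$

Therefore, it can also be split into two terms,

$$\mathbf{B}(t,\mathbf{x}) = \mathbf{B}_v(t,\mathbf{x}) + \mathbf{B}_{\rm rad}(t,\mathbf{x}), \tag{10.109}$$

where

$$\mathbf{B}_v(t,\mathbf{x}) = \frac{1}{c}\, \hat{\mathbf{R}}_a \times \mathbf{E}_v(t,\mathbf{x}) \tag{10.110}$$

depends only on retarded position and velocity, while

$$\mathbf{B}_{\rm rad}(t,\mathbf{x}) = \frac{1}{c}\, \hat{\mathbf{R}}_a \times \mathbf{E}_{\rm rad}(t,\mathbf{x}) \tag{10.111}$$

depends also on the retarded acceleration. Performing explicitly the vector products, and using $\mu_0$ instead of $\epsilon_0$, we get

$$\mathbf{B}_v(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\, \frac{q\, \mathbf{v}_r \times \hat{\mathbf{R}}_a}{\gamma^2 R_a^2\, (1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^3}, \tag{10.112}$$

and

$$\mathbf{B}_{\rm rad}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\, \frac{q}{c^2 R_a}\, \frac{(\mathbf{v}_r \times \hat{\mathbf{R}}_a)(\dot{\mathbf{v}}_r\cdot\hat{\mathbf{R}}_a) + c\,(1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)\, \dot{\mathbf{v}}_r \times \hat{\mathbf{R}}_a}{(1 - \hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^3}. \tag{10.113}$$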
Note that the first term in the numerator of eq. (10.113) depends on the component of the acceleration parallel to $\hat{\mathbf{R}}_a$, while the second depends on the component transverse to $\hat{\mathbf{R}}_a$. For $\mathbf{v}(t) = \mathbf{v}$, constant, $\mathbf{E}_{\rm rad}$ vanishes, while $\mathbf{E}_v$ reduces to the result given in eq. (10.84).$^{17}$

A crucial difference between the two terms in eq. (10.103) is their behavior at large distances from the source. If $R_a\to\infty$, $\mathbf{E}_v$ decays as $1/R_a^2$, just as the Coulomb field, to which it reduces in the non-relativistic limit. From eq. (10.110), the same $1/R_a^2$ behavior at large $R_a$ then holds for $\mathbf{B}_v$. Therefore, in the absence of acceleration, the Poynting vector (3.34) decays as $1/R_a^4$. The total flux radiated at large distances is given by the right-hand side of eq. (3.35); since $ds = R^2 d\Omega$ and $|\mathbf{S}| \sim 1/R^4$, the flux at infinity vanishes. This part of the field is therefore known as the non-radiative part (or the induction part). In contrast, $\mathbf{E}_{\rm rad}$ and $\mathbf{B}_{\rm rad}$ decay only as $1/R_a$ at large distances. Their combined contribution to the Poynting vector then goes as $1/R_a^2$, and the flux at infinity is non-vanishing. This means that energy is radiated away to infinity, and this part of the field is called radiative. We will compute the radiated power, in a full relativistic setting, in Section 10.6, after having first studied the non-relativistic limit in Section 10.5, and we will then discuss the radiation field in greater detail in Chapter 11.

From eq. (10.111) we see that $\mathbf{B}_{\rm rad}$ is orthogonal to both $\mathbf{E}_{\rm rad}$ and $\hat{\mathbf{R}}_a$. Similarly, because of eq. (10.110), $\mathbf{B}_v$ is orthogonal to both $\mathbf{E}_v$ and $\hat{\mathbf{R}}_a$. However, for the radiative field, we see from eq. (10.105) that $\mathbf{E}_{\rm rad}$ is also orthogonal to $\hat{\mathbf{R}}_a$. Therefore, in the radiative part of the electromagnetic field, $\mathbf{E}$, $\mathbf{B}$, and $\hat{\mathbf{R}}_a$ form an orthogonal system of vectors, and eq. (10.111) then implies

$$c\,|\mathbf{B}_{\rm rad}| = |\mathbf{E}_{\rm rad}|. \tag{10.115}$$

In contrast, we see from eq. (10.104) that $\mathbf{E}_v$ is not orthogonal to $\hat{\mathbf{R}}_a$ (actually, in the non-relativistic limit, it becomes exactly parallel to $\hat{\mathbf{R}}_a$, and reduces to the radial Coulomb field), and then eq. (10.110) implies that

$$c\,|\mathbf{B}_v| < |\mathbf{E}_v|, \tag{10.116}$$

with $\mathbf{B}_v$ vanishing in the limit $\mathbf{v}_r\to 0$, as we see from eq. (10.112).
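The orthogonality properties of the radiative fields, and the equivalence between eq. (10.111) and the explicit form (10.113), can be checked numerically. The following is only a sketch under stated assumptions: the fields are returned in units of the common prefactor $q/(4\pi\epsilon_0 c^2 R_a)$, the default is $c = 1$, and the helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def radiation_fields(n, v, a, c=1.0):
    """E_rad from eq. (10.105) and B_rad from eq. (10.111), both in units of
    q / (4 pi eps0 c^2 R_a). `n` must be the unit vector R_a / |R_a|."""
    kappa = 1.0 - dot(n, v) / c
    u = tuple(ni - vi / c for ni, vi in zip(n, v))
    E = tuple(comp / kappa ** 3 for comp in cross(cross(a, u), n))
    B = tuple(comp / c for comp in cross(n, E))
    return E, B

def b_rad_explicit(n, v, a, c=1.0):
    """B_rad from the explicit form (10.113), in the same units as above."""
    kappa = 1.0 - dot(n, v) / c
    v_x_n = cross(v, n)
    a_x_n = cross(a, n)
    a_dot_n = dot(a, n)
    return tuple((p * a_dot_n + c * kappa * s) / (c ** 2 * kappa ** 3)
                 for p, s in zip(v_x_n, a_x_n))
```

With any unit vector `n`, subluminal `v`, and acceleration `a`, one expects `dot(n, E)` and `dot(E, B)` to vanish and `c * norm(B)` to equal `norm(E)`, in agreement with eq. (10.115).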


17 This can be shown using eq. (10.81) to write the denominator in eq. (10.104) in the form given in eq. (10.84) while, for the numerator, we observe that, for constant $\mathbf{v}$,

$$\mathbf{R}_a - R_a \mathbf{v}_r/c = \mathbf{x} - \mathbf{x}_0(t_{\rm ret}) - c\,(t - t_{\rm ret})\,\mathbf{v}/c = \mathbf{x} - [\mathbf{x}_0(t_{\rm ret}) + \mathbf{v}\,(t - t_{\rm ret})] = \mathbf{x} - \mathbf{x}_0(t), \tag{10.114}$$

where, for constant $\mathbf{v}$, we have used $R_a(t) = R(t_{\rm ret}) = c\,(t - t_{\rm ret})$ from eq. (10.79).

10.5 Radiation from non-relativistic charges. Larmor formula


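Now consider the field generated by a non-relativistic particle that moves in a bounded region of space. In this case, in eq. (10.57), $|\mathbf{x}_0(t_{\rm ret})| < d$ for some length-scale $d$ so, in the limit $|\mathbf{x}|\to\infty$, also $|\mathbf{R}_a|\to\infty$. Then, from eqs. (10.103)–(10.105), to lowest order in $v/c$ and lowest order in $1/R_a$ the electric field becomes

$$\mathbf{E}(t,\mathbf{x}) \simeq \mathbf{E}_{\rm rad}(t,\mathbf{x}) \simeq \frac{1}{4\pi\epsilon_0 c^2}\, \frac{q}{R_a}\, [\mathbf{a}(t_{\rm ret}) \times \hat{\mathbf{R}}_a] \times \hat{\mathbf{R}}_a, \tag{10.117}$$

where we have written the retarded acceleration $\dot{\mathbf{v}}_r$ as $\mathbf{a}(t_{\rm ret})$. We now observe that, given any vector $\mathbf{V}$ and any unit vector $\hat{\mathbf{n}}$, we can always write

$$\mathbf{V} = \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{V}) + [\mathbf{V} - \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{V})] = \mathbf{V}_\parallel(\hat{\mathbf{n}}) + \mathbf{V}_\perp(\hat{\mathbf{n}}), \tag{10.118}$$

where

$$\mathbf{V}_\parallel(\hat{\mathbf{n}}) = \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{V}), \tag{10.119}$$

$$\mathbf{V}_\perp(\hat{\mathbf{n}}) = \mathbf{V} - \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{V}). \tag{10.120}$$

The vector $\mathbf{V}_\parallel(\hat{\mathbf{n}})$ is the projection of $\mathbf{V}$ in the direction of $\hat{\mathbf{n}}$, while $\mathbf{V}_\perp(\hat{\mathbf{n}})$ is transverse to $\hat{\mathbf{n}}$, since

$$\hat{\mathbf{n}}\cdot[\mathbf{V} - \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{V})] = (\hat{\mathbf{n}}\cdot\mathbf{V}) - (\hat{\mathbf{n}}\cdot\mathbf{V}) = 0. \tag{10.121}$$

Using eq. (1.9) we see that we can also rewrite $\mathbf{V}_\perp(\hat{\mathbf{n}})$ in the form

$$\mathbf{V}_\perp(\hat{\mathbf{n}}) = -\hat{\mathbf{n}} \times (\hat{\mathbf{n}} \times \mathbf{V}), \tag{10.122}$$

or, equivalently,

$$\mathbf{V}_\perp(\hat{\mathbf{n}}) = -(\mathbf{V} \times \hat{\mathbf{n}}) \times \hat{\mathbf{n}}. \tag{10.123}$$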
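The equivalence between the projection form (10.120) and the double cross product (10.122) is easy to verify numerically. The snippet below is a minimal sketch; the helper names are illustrative and the unit vector is supplied by the caller.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transverse_by_projection(V, n):
    """V_perp = V - n (n.V), eq. (10.120); n must be a unit vector."""
    c = dot(n, V)
    return tuple(v - c * ni for v, ni in zip(V, n))

def transverse_by_cross(V, n):
    """V_perp = -n x (n x V), eq. (10.122); n must be a unit vector."""
    w = cross(n, cross(n, V))
    return tuple(-x for x in w)
```

For example, with $\hat{\mathbf{n}} = (0.6, 0, 0.8)$ and $\mathbf{V} = (1, -2, 3)$ both functions give $(-0.8, -2, 0.6)$, which is indeed orthogonal to $\hat{\mathbf{n}}$.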


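We then decompose $\mathbf{a}(t_{\rm ret})$ into its parts orthogonal and parallel with respect to $\hat{\mathbf{R}}_a$,

$$\mathbf{a}(t_{\rm ret}) = \mathbf{a}_\perp(t_{\rm ret}) + \mathbf{a}_\parallel(t_{\rm ret}). \tag{10.124}$$

In eq. (10.117) the term $\mathbf{a}_\parallel(t_{\rm ret})$ gives a vanishing contribution when taking the vector product with $\hat{\mathbf{R}}_a$, while, from eq. (10.123),

$$[\mathbf{a}_\perp(t_{\rm ret}) \times \hat{\mathbf{R}}_a] \times \hat{\mathbf{R}}_a = -\mathbf{a}_\perp(t_{\rm ret}). \tag{10.125}$$

Therefore, to lowest order in $v/c$ and lowest order in $1/R_a$,

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{q}{R_a(t)}\, \mathbf{a}_\perp[t - R_a(t)/c]. \tag{10.126}$$

Note that $q\mathbf{x}_0(t)$ is the dipole moment $\mathbf{d}(t)$ of the charge, so eq. (10.126) can be rewritten as

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{1}{R_a(t)}\, \ddot{\mathbf{d}}_\perp[t - R_a(t)/c]. \tag{10.127}$$

We see that, to lowest order in $v/c$, the radiation field is generated by the dipole moment of the charge.$^{18}$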

Two main features of this result are:

(1) As we already remarked, at large distances the field generated by an accelerated charge decays as $1/R_a$. It is therefore a radiative field, contrary to the Coulomb field of a static charge, or of a charge in uniform motion, which decays as $1/R_a^2$.

(2) For a non-relativistic charge moving in a bounded region, $|\mathbf{x}_0(t)| < d$, the electric field at a point $\mathbf{x}$, with $|\mathbf{x}| \gg d$, is proportional (and opposite) to the component of the acceleration (computed at retarded time) transverse to the line of sight from the observer at $\mathbf{x}$ to the apparent position of the charge. In particular, a charge accelerating along a straight line does not emit radiation in that direction. More generally, a non-relativistic charge does not radiate in the direction of its apparent acceleration.


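We also observe that, in the limit $|\mathbf{x}| \gg d$,

$$\mathbf{R}_a(t,\mathbf{x}) = \mathbf{x} - \mathbf{x}_0[t_{\rm ret}(t,\mathbf{x})] \simeq \mathbf{x} - \mathbf{x}_0(t) = \mathbf{R}(t,\mathbf{x}), \tag{10.128}$$

since the distance $|\mathbf{x} - \mathbf{x}_0(t)|$ is very large compared to $d$, while $|\mathbf{x}_0(t)|$ and $|\mathbf{x}_0[t_{\rm ret}(t,\mathbf{x})]|$ are both smaller than $d$. Therefore, in the limit of large distances from the source, to leading order eq. (10.126) is equivalent to

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{q}{R(t)}\, \mathbf{a}_\perp[t - R(t)/c], \tag{10.129}$$

or, more explicitly,

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{q}{|\mathbf{x} - \mathbf{x}_0(t)|}\, \mathbf{a}_\perp\!\left(t - \frac{|\mathbf{x} - \mathbf{x}_0(t)|}{c}\right). \tag{10.130}$$

Note that $R(t,\mathbf{x}) = |\mathbf{x} - \mathbf{x}_0(t)|$ is the distance between the point $\mathbf{x}$ where we compute the fields and the instantaneous position of the source.

Actually, in the limit in which $r = |\mathbf{x}| \gg d$, to lowest order we can simply neglect altogether the term $\mathbf{x}_0(t)$ in $R(t)$ and replace $R(t)$ with $r$, which is the distance of the point $\mathbf{x}$ from a fixed origin, which it is convenient to choose inside the region where the motion of the source is localized. Then, we can write simply

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{q}{r}\, \mathbf{a}_\perp(t - r/c), \tag{10.131}$$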


or

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{1}{r}\, \ddot{\mathbf{d}}_\perp(t - r/c). \tag{10.132}$$

Equivalently, in terms of $\mu_0$,

$$\mathbf{E}(t,\mathbf{x}) \simeq -\frac{\mu_0}{4\pi r}\, \ddot{\mathbf{d}}_\perp(t - r/c). \tag{10.133}$$

In Section 11.2 we will see in detail how eq. (10.132) emerges as the lowest-order term in a systematic expansion in $v/c$, with higher-order terms parametrized by time derivatives of higher and higher multipole moments.

18 As we discussed below eq. (6.21), the value of the dipole moment of a system with non-vanishing total charge changes if we shift the origin of the coordinate system: if $\mathbf{x}_0(t) \to \mathbf{x}_0(t) + \mathbf{s}$, with $\mathbf{s}$ a constant vector, then $\mathbf{d} \to \mathbf{d} + q\mathbf{s}$. However, the extra term is constant in time, and does not affect $\ddot{\mathbf{d}}$.

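The notation $\ddot{\mathbf{d}}_\perp$ is useful to stress that, physically, only the component of $\ddot{\mathbf{d}}$ transverse to the line of sight contributes. For explicit computations, however, it can be more useful to leave it in the original form (10.117), in terms of a triple vector product. In the approximation in which we replace $\mathbf{R}_a$ by $\mathbf{x} = r\hat{\mathbf{n}}$, from eq. (10.122) we have

$$\ddot{\mathbf{d}}_\perp = -\hat{\mathbf{n}} \times (\hat{\mathbf{n}} \times \ddot{\mathbf{d}}). \tag{10.134}$$

Then, we can rewrite eq. (10.127) as

$$\mathbf{E}_{\rm rad}(t,\mathbf{x}) \simeq \frac{1}{4\pi\epsilon_0 c^2}\, \frac{1}{r}\, \hat{\mathbf{n}} \times [\hat{\mathbf{n}} \times \ddot{\mathbf{d}}(t_{\rm ret})], \tag{10.135}$$

where, in the same approximation, $t_{\rm ret} = t - (r/c)$. Another useful form of this result is obtained using the identity (1.9),

$$\mathbf{E}_{\rm rad}(t,\mathbf{x}) \simeq -\frac{1}{4\pi\epsilon_0 c^2}\, \frac{1}{r}\, \left\{\ddot{\mathbf{d}}(t_{\rm ret}) - [\hat{\mathbf{n}}\cdot\ddot{\mathbf{d}}(t_{\rm ret})]\,\hat{\mathbf{n}}\right\}. \tag{10.136}$$

From eq. (10.135) we also see that the modulus of the electric field can be written as

$$|\mathbf{E}_{\rm rad}| \simeq \frac{1}{4\pi\epsilon_0 c^2}\, \frac{1}{r}\, |\hat{\mathbf{n}} \times \ddot{\mathbf{d}}(t_{\rm ret})|. \tag{10.137}$$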


For the magnetic field, to leading order we get 


1 1 Pa 
= Ax d(tret), (10.138) 


ATE r 


B(t,x) > 


where we used eq. (10.108) (with Ra replaced by n) and eq. (10.135), 
and we expanded the triple vector product using eq. (1.9). Using Ho 
instead of €o in the expression for the magnetic field, 


m E 
B(t,x) ~ E = Â x d(tret) - (10.139) 
We next compute the power radiated. Using eq. (10.108), with $\hat{\mathbf{R}}_a$
replaced by $\hat{\mathbf{n}}$, for the Poynting vector (3.34) we get

\[ \mathbf{S} = \frac{1}{\mu_0 c}\,|\mathbf{E}|^2\,\hat{\mathbf{n}}. \]

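In eq. (3.35) we saw that the energy flowing per unit time out of a
volume $V$ is given by $\int_{\partial V} d\mathbf{s}\cdot\mathbf{S}$ (note that the energy flowing out of the
volume is given by this expression with the plus sign so that, when
$\int_{\partial V} d\mathbf{s}\cdot\mathbf{S}$ is positive, according to eq. (3.35) the energy inside the volume
$V$ decreases). For the volume $V$ we take a sphere of large radius $r$, so
$d\mathbf{s} = r^2 d\Omega\,\hat{\mathbf{n}}$. Since the energy flowing out per unit time is the power $P$
radiated, we see that the power $dP$ radiated at time $t$ through a surface
at distance $r$ from the source and within an infinitesimal solid angle $d\Omega$
is

\[ dP(t;\theta,\phi) = \frac{1}{\mu_0 c}\,\frac{\mu_0^2}{(4\pi)^2}\,\frac{1}{r^2}\,\left|\hat{\mathbf{n}}\times\ddot{\mathbf{d}}(t_{\rm ret})\right|^2 r^2 d\Omega \tag{10.140} \]

\[ \phantom{dP(t;\theta,\phi)} = \frac{\mu_0}{(4\pi)^2 c}\,\left|\hat{\mathbf{n}}\times\ddot{\mathbf{d}}(t_{\rm ret})\right|^2 d\Omega. \tag{10.141} \]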
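Therefore, the angular distribution of the radiated power is

\[ \frac{dP(t;\theta,\phi)}{d\Omega} = \frac{\mu_0}{(4\pi)^2 c}\,\left|\hat{\mathbf{n}}\times\ddot{\mathbf{d}}(t_{\rm ret})\right|^2. \tag{10.142} \]

At a given time $t$, we choose polar coordinates with the polar axis in the
direction of $\ddot{\mathbf{d}}(t_{\rm ret})$, so $|\hat{\mathbf{n}}\times\ddot{\mathbf{d}}(t_{\rm ret})| = |\ddot{\mathbf{d}}(t_{\rm ret})|\sin\theta$. Then,

\[ \frac{dP(t;\theta)}{d\Omega} = \frac{\mu_0}{(4\pi)^2 c}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2 \sin^2\theta. \tag{10.143} \]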


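Note that the angular distribution is independent of $\phi$, because the setting
is invariant under rotations around the direction of $\ddot{\mathbf{d}}(t_{\rm ret})$. We can
rewrite this in terms of $\epsilon_0$, as¹⁹

\[ \frac{dP(t;\theta)}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2 \sin^2\theta. \tag{10.144} \]

The integration over the solid angle is performed using

\[ \int_0^{2\pi} d\phi \int_{-1}^{1} d\cos\theta\,\sin^2\theta = 2\pi\times\frac{4}{3}. \tag{10.145} \]

We then obtain Larmor’s formula, either in the form

\[ P(t) = \frac{\mu_0}{6\pi c}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2, \tag{10.146} \]

or, in terms of $\epsilon_0$, as²⁰,²¹

\[ P(t) = \frac{1}{4\pi\epsilon_0}\,\frac{2}{3c^3}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2. \tag{10.148} \]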
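As a quick numerical cross-check of the step from eq. (10.144) to eq. (10.148), the following sketch integrates $\sin^2\theta$ over the solid angle with a simple midpoint rule and compares the result with $8\pi/3$. The function names, the grid size, and the sample value of $|\ddot{\mathbf{d}}|$ are illustrative assumptions, not part of the text.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m (SI)
C = 299792458.0           # speed of light in m/s

def solid_angle_integral_sin2(n_theta=2000):
    """Midpoint-rule estimate of the integral of sin^2(theta) over the sphere."""
    h = math.pi / n_theta
    total = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * h
        # dOmega = sin(theta) dtheta dphi; the phi integral contributes 2*pi.
        total += math.sin(theta) ** 3 * h
    return 2.0 * math.pi * total

def larmor_power(d_ddot):
    """Eq. (10.148): total power for a given |d''(t_ret)| in C m/s^2."""
    return (1.0 / (4.0 * math.pi * EPS0)) * 2.0 * d_ddot ** 2 / (3.0 * C ** 3)

def power_from_angular_distribution(d_ddot):
    """Integrate eq. (10.144) over the solid angle numerically."""
    prefactor = (1.0 / (4.0 * math.pi * EPS0)) * d_ddot ** 2 / (4.0 * math.pi * C ** 3)
    return prefactor * solid_angle_integral_sin2()

integral = solid_angle_integral_sin2()
sample = 1.0e-10  # hypothetical value of |d''|, used only for illustration
print(integral, 8.0 * math.pi / 3.0)
print(power_from_angular_distribution(sample), larmor_power(sample))
```

The two printed pairs should agree to within the accuracy of the quadrature, which illustrates that eq. (10.148) is just eq. (10.144) multiplied by $8\pi/3$.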


19 We give the most important results both in terms of $\mu_0$ and in terms of
$\epsilon_0$. In the latter case, factorizing a factor $4\pi\epsilon_0$ allows us to quickly pass from
SI units to Gaussian units, just setting $4\pi\epsilon_0 = 1$; see Appendix A for details.


20 Observe that this is the power radiated instantaneously at time $t$, through
a sphere of fixed radius $r$, and has been obtained neglecting altogether $|\mathbf{x}_0(t)|$
with respect to $r$. Correspondingly, we have approximated $t_{\rm ret} = t - (r/c)$.
If one wants to integrate the power (10.148) over a given period of time,
one must make sure that the condition $|\mathbf{x}_0(t)| \ll r$ is valid all along the time
period considered; if not, one must go back to eq. (10.127) and take into account
the actual time dependence of $R_a(t)$. However, if we are interested in
the formal limit $r \to \infty$, the issue does not arise.


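21 An alternative derivation, useful also in the generalization to higher
multipoles that we will study in Section 11.2.2, is obtained starting from
eq. (10.132) and using the expression (10.120) for the transverse part of a
vector. This gives, for the power,

\[
\begin{aligned}
P &= \int d\Omega\,\frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\,\left|\ddot{\mathbf{d}} - \hat{\mathbf{n}}(\ddot{\mathbf{d}}\cdot\hat{\mathbf{n}})\right|^2_{t_{\rm ret}} \\
&= \frac{1}{4\pi\epsilon_0 c^3}\int\frac{d\Omega}{4\pi}\left[|\ddot{\mathbf{d}}|^2 - (\ddot{\mathbf{d}}\cdot\hat{\mathbf{n}})^2\right]_{t_{\rm ret}} \\
&= \frac{1}{4\pi\epsilon_0 c^3}\,\ddot{d}_i\ddot{d}_j\Big|_{t_{\rm ret}}\left[\delta_{ij} - \int\frac{d\Omega}{4\pi}\,\hat{n}_i\hat{n}_j\right].
\end{aligned}
\tag{10.147}
\]

The remaining angular integral can be performed using eq. (6.49), and we get
eq. (10.148).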
10.6 Power radiated by relativistic sources


10.6.1 Relativistic generalization of Larmor’s 
formula 


After having discussed the power radiated in the non-relativistic limit, 
we now go back to the general results of Section 10.4, and we compute 
the radiated power in the full relativistic setting. The most straightfor- 
ward approach consists in taking the electric and magnetic fields in the 
radiation zone from eqs. (10.105) and (10.108). For the Poynting vector 
(3.34), at time t and position x, we then find 


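\[
\begin{aligned}
\mathbf{S}(t,\mathbf{x}) &= \frac{1}{\mu_0 c}\,|\mathbf{E}|^2\,\hat{\mathbf{R}}_a \\
&= \epsilon_0 c\left(\frac{q}{4\pi\epsilon_0}\right)^2\frac{1}{R_a^2}\,
\frac{\left|\left[\dot{\mathbf{v}}_r\times(\hat{\mathbf{R}}_a - \mathbf{v}_r/c)\right]\times\hat{\mathbf{R}}_a\right|^2}{c^4\,(1-\hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^6}\,\hat{\mathbf{R}}_a,
\end{aligned}
\tag{10.149}
\]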


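where, to keep the notation simple, we have omitted the arguments $(t,\mathbf{x})$
in $R_a(t,\mathbf{x})$ and $\hat{\mathbf{R}}_a(t,\mathbf{x})$, and the argument $t$ in $\mathbf{v}_r(t)$ and $\dot{\mathbf{v}}_r(t)$. Given
the time $t$ and position $\mathbf{x}$ at which we observe the radiation, and given
the trajectory $\mathbf{x}_0(t)$ of the particle, eq. (10.45) determines the time $t_{\rm ret}$
at which the radiation was emitted, and therefore the position $\mathbf{x}_0(t_{\rm ret})$
of the particle at the time of emission. By definition of $R_a(t,\mathbf{x})$, the
distance between this position and the observation point, $|\mathbf{x} - \mathbf{x}_0(t_{\rm ret})|$,
is equal to $R_a(t,\mathbf{x})$, see eq. (10.57). Now consider a sphere of radius
$R_a(t,\mathbf{x})$ centered on the position $\mathbf{x}_0(t_{\rm ret})$ of the particle at the time of
emission. The power radiated per unit solid angle through this sphere is

\[
\frac{d\mathcal{E}}{dt\,d\Omega} = R_a^2\,\mathbf{S}\cdot\hat{\mathbf{R}}_a
= \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^3}\,\frac{q^2}{4\pi}\,
\frac{\left|\left[\dot{\mathbf{v}}_r\times(\hat{\mathbf{R}}_a - \mathbf{v}_r/c)\right]\times\hat{\mathbf{R}}_a\right|^2}{(1-\hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)^6},
\tag{10.150}
\]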


where we denote by $\mathcal{E}$ the energy in the electromagnetic field. On the
left-hand side, we have explicitly written the radiated power in the form
$d\mathcal{E}/dt$, to stress that this is the energy per unit time interval $dt$. This is
the time relevant for the distant observer, located at a distance $R_a$ from
the charge. This is the quantity that we have denoted simply by $P$ in
the previous chapters,

\[ P = \frac{d\mathcal{E}}{dt}. \tag{10.151} \]

When it is useful to stress that this is the power measured by the distant
observer, we will call it the “received” power, and we will use the
notation $P_r$.

However, from the point of view of the charge that emits the radiation,
$d\mathcal{E}/dt_{\rm ret}$ is more relevant, since this determines the rate at which the
charge loses energy, with respect to the time measured by an observer
located at the instantaneous position of the charge (and at rest with
respect to the distant observer); in particular, $t_{\rm ret}$ is related to the proper
time $\tau$ of the charge by $d\tau = dt_{\rm ret}/\gamma(v)$, where $v$ is the instantaneous
velocity of the charge, with respect to this observer. We call this the
“emitted” power, and we denote it by $P_e$,

\[ P_e \equiv \frac{d\mathcal{E}}{dt_{\rm ret}}. \tag{10.152} \]
The relation between $d/dt$ and $d/dt_{\rm ret}$, for fixed value of the point $\mathbf{x}$
where the radiation field is observed, was already computed in eq. (10.100).
Then,

\[ P_e = (1-\hat{\mathbf{R}}_a\cdot\mathbf{v}_r/c)\,P_r. \tag{10.153} \]

Which quantity is more appropriate, $P_r$ or $P_e$, depends on the
physical situation considered. In the rest of this section we will focus on
$P_e$ (using the more explicit notation $d\mathcal{E}/dt_{\rm ret}$, for clarity) which, as we
will see, is Lorentz invariant and has an elegant expression that makes
its Lorentz invariance explicit.

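Using eq. (10.153), and expressing the result in terms of $t_{\rm ret}$ by using
$\hat{\mathbf{R}}(t_{\rm ret},\mathbf{x})$ instead of $\hat{\mathbf{R}}_a(t,\mathbf{x})$ [see eq. (10.58)], we get

\[
\frac{d\mathcal{E}}{dt_{\rm ret}\,d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^3}\,\frac{q^2}{4\pi}\,
\frac{\left|\left[\dot{\mathbf{v}}_r\times(\hat{\mathbf{R}} - \mathbf{v}_r/c)\right]\times\hat{\mathbf{R}}\right|^2}{(1-\hat{\mathbf{R}}\cdot\mathbf{v}_r/c)^5},
\tag{10.154}
\]

where $\hat{\mathbf{R}} = \hat{\mathbf{R}}(t_{\rm ret},\mathbf{x})$. Note that everything here is expressed in terms
of $t_{\rm ret}$ rather than $t$, since, as we saw in eqs. (10.59) and (10.107),
$\mathbf{v}_r = d\mathbf{x}_0(t_{\rm ret})/dt_{\rm ret}$ and $\dot{\mathbf{v}}_r = d^2\mathbf{x}_0(t_{\rm ret})/dt_{\rm ret}^2$.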

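The integration over the angles can now be performed, writing
$\hat{\mathbf{R}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$, and gives

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3c^3}\,\gamma^6(v_r)\left[|\dot{\mathbf{v}}_r|^2 - \frac{|\mathbf{v}_r\times\dot{\mathbf{v}}_r|^2}{c^2}\right],
\tag{10.155}
\]

which we can also write as

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3c^3}\,\gamma^6\left[a^2 - \frac{|\mathbf{v}\times\mathbf{a}|^2}{c^2}\right]_{t=t_{\rm ret}},
\tag{10.156}
\]

where $\mathbf{a} = \dot{\mathbf{v}}$ and $a = |\mathbf{a}|$. In the limit $v/c \to 0$ we have $\gamma \to 1$, while
the second term in the bracket is suppressed by a factor $(v/c)^2$ with
respect to the first. Then, writing $q^2a^2 = q^2|\ddot{\mathbf{x}}_0|^2 = |\ddot{\mathbf{d}}|^2$, we recover
the Larmor formula (10.148) [note that, in the limit $v/c \to 0$, $P_e$ is the
same as $P_r = P$, as we see from eq. (10.153)]. In this sense, eq. (10.156)
is also referred to as the relativistic Larmor formula. Observe that the
factor $\gamma^6$ means that the power radiated by a relativistic charge is highly
enhanced, compared to the non-relativistic case.

We can rewrite eq. (10.156) expanding²²

\[ |\mathbf{v}\times\mathbf{a}|^2 = v^2a^2 - (\mathbf{v}\cdot\mathbf{a})^2. \tag{10.157} \]

22 This is obtained by writing $|\mathbf{v}\times\mathbf{a}|^2 = (\epsilon_{ijk}v_ja_k)(\epsilon_{ilm}v_la_m)$,
and using $\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$.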
23 This follows from

\[
\mathbf{v}\cdot\mathbf{a} = \mathbf{v}\cdot\frac{d\mathbf{v}}{dt} = \frac{1}{2}\frac{d(\mathbf{v}\cdot\mathbf{v})}{dt}
= \frac{1}{2}\frac{d}{dt}(v^2) = v\frac{dv}{dt}.
\tag{10.159}
\]

Note that $dv/dt = d|\mathbf{v}|/dt$ and, of course, this is in general different from
$a = |\mathbf{a}| = |d\mathbf{v}/dt|$. As we see from eq. (10.159), the equality $dv/dt = a$
implies $\mathbf{v}\cdot\mathbf{a} = va$ and therefore only holds when $\mathbf{v}$ and $\mathbf{a}$ are parallel.


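Then

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3c^3}\,\gamma^6\left[\frac{a^2}{\gamma^2} + \frac{(\mathbf{v}\cdot\mathbf{a})^2}{c^2}\right]_{t=t_{\rm ret}}.
\tag{10.158}
\]

We can also use $\mathbf{v}\cdot\mathbf{a} = v\,dv/dt$,²³ to rewrite this as

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3c^3}\,\gamma^6\left[\frac{a^2}{\gamma^2} + \frac{v^2}{c^2}\left(\frac{dv}{dt}\right)^2\right].
\tag{10.160}
\]

Another useful form is obtained by decomposing $\mathbf{a}$ into its components
parallel and transverse to $\mathbf{v}$, as in eq. (10.118), $\mathbf{a} = \mathbf{a}_\parallel + \mathbf{a}_\perp$. Then
$a^2 = a_\parallel^2 + a_\perp^2$, where $a_\parallel = |\mathbf{a}_\parallel|$ and $a_\perp = |\mathbf{a}_\perp|$, while $\mathbf{v}\cdot\mathbf{a} = va_\parallel$, so
eq. (10.158) becomes

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3c^3}\,\gamma^4\left(a_\perp^2 + \gamma^2a_\parallel^2\right)_{t=t_{\rm ret}}.
\tag{10.161}
\]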
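The equivalence of the forms (10.156), (10.158) and (10.161) is easy to confirm numerically. The sketch below evaluates the three brackets (with the common prefactor $(1/4\pi\epsilon_0)\,2q^2/(3c^3)$ dropped) for an arbitrary velocity and acceleration; the helper names, the choice of units $c = 1$, and the sample vectors are assumptions made only for illustration.

```python
import math

def dot(u, w):
    return sum(x * y for x, y in zip(u, w))

def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def gamma_factor(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - dot(v, v) / c ** 2)

def form_10_156(v, a, c=1.0):
    """gamma^6 [a^2 - |v x a|^2 / c^2]"""
    vxa = cross(v, a)
    return gamma_factor(v, c) ** 6 * (dot(a, a) - dot(vxa, vxa) / c ** 2)

def form_10_158(v, a, c=1.0):
    """gamma^6 [a^2 / gamma^2 + (v.a)^2 / c^2]"""
    g = gamma_factor(v, c)
    return g ** 6 * (dot(a, a) / g ** 2 + dot(v, a) ** 2 / c ** 2)

def form_10_161(v, a, c=1.0):
    """gamma^4 (a_perp^2 + gamma^2 a_par^2); requires v != 0."""
    g = gamma_factor(v, c)
    a_par = dot(v, a) / math.sqrt(dot(v, v))   # signed component along v
    a_perp_sq = dot(a, a) - a_par ** 2
    return g ** 4 * (a_perp_sq + g ** 2 * a_par ** 2)

v = (0.3, -0.5, 0.6)   # sample velocity with |v| < c = 1
a = (1.0, 2.0, -0.7)   # sample acceleration
values = (form_10_156(v, a), form_10_158(v, a), form_10_161(v, a))
print(values)
```

Setting `a` parallel or perpendicular to `v` reproduces the $\gamma^6a^2$ and $\gamma^4a^2$ scalings of the two special cases treated in Sections 10.6.2 and 10.6.3.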


An alternative and elegant derivation of the relativistic Larmor formula
is obtained if one realizes that $d\mathcal{E}/dt_{\rm ret}$ is a Lorentz-invariant quantity.
To show this, consider an inertial frame $K$ where the charge is instantaneously
at rest at a given value $t_{\rm ret}$ of retarded time (defined with
respect to a far observer at rest with respect to $K$), so $\mathbf{v}(t_{\rm ret}) = 0$,
although $\dot{\mathbf{v}}(t_{\rm ret}) \neq 0$ since the charge is accelerating. In the frame $K$, we
denote by $d\mathcal{E}$ the energy emitted by the charge in the interval between
$t_{\rm ret}$ and $t_{\rm ret} + dt_{\rm ret}$ (and which will therefore be seen by the distant
observer in the interval between $t$ and $t + dt$), and by $d\mathbf{P}_{\rm em}$ the radiated
momentum. However, in the limit $v/c \to 0$, and therefore for a particle
instantaneously at rest, the radiation is given by the electric dipole term.
The symmetry of the dipole radiation implies that the momenta carried
away by the radiation in opposite directions are equal in magnitude and
opposite in direction, so the momentum radiated in the interval between
$t_{\rm ret}$ and $t_{\rm ret} + dt_{\rm ret}$ vanishes, $d\mathbf{P}_{\rm em} = 0$.

Let $K'$ be a boosted frame, where the instantaneous velocity of the
charge is $\mathbf{v}_0$, and let $t'_{\rm ret}$ be the retarded time measured in this frame.
As we have shown in Solved Problem 8.2, the energy and momentum of
the electromagnetic field form a four-vector so, from eq. (7.49), in the
frame $K'$ the radiated energy is

\[ d\mathcal{E}' = \gamma(v_0)\left(d\mathcal{E} + \mathbf{v}_0\cdot d\mathbf{P}_{\rm em}\right). \tag{10.162} \]

However, since $d\mathbf{P}_{\rm em} = 0$, we simply have $d\mathcal{E}' = \gamma(v_0)\,d\mathcal{E}$. At the same
time, at the position $\mathbf{x} = 0$ where the particle instantaneously sits in
the $K$ frame, the transformation of the “local” time variables $t_{\rm ret}$ is
$dt'_{\rm ret} = \gamma(v_0)\,dt_{\rm ret}$. Therefore, $d\mathcal{E}'/dt'_{\rm ret} = d\mathcal{E}/dt_{\rm ret}$, so $d\mathcal{E}/dt_{\rm ret}$ is Lorentz
invariant.

We now use the fact that, in the $K$ frame where the charge is instantaneously
at rest, the radiated power is given by the non-relativistic Larmor
formula (10.148): indeed, given that in this frame $v = 0$, the higher-order
corrections in $v/c$ are identically zero, and the non-relativistic
Larmor formula becomes exact. Then, to find the radiated power in a
generic boosted frame, it is sufficient to find a Lorentz-invariant expression
that reduces to the Larmor formula (10.148) in the instantaneous
rest frame of the particle.²⁴ To this purpose, we consider the
Lorentz-invariant quantity

\[ \frac{dp_\mu}{d\tau}\frac{dp^\mu}{d\tau} \equiv \left(\frac{dp}{d\tau}\right)^2, \tag{10.163} \]

where $\tau$ is the proper time of the radiating charge and $p^\mu$ is its relativistic
four-momentum, so $p^0 = \gamma(v)mc$ and $\mathbf{p} = \gamma(v)m\mathbf{v}$. It is very
natural to consider the quantity in eq. (10.163), when looking for a
Lorentz-invariant expression that reproduces eq. (10.160). Indeed, a
point particle, without internal structure, is only characterized by its
four-momentum $p^\mu$ and by its proper time $\tau$. The invariant $p_\mu p^\mu$ is
equal to $-m^2c^2$, so it is just a fixed number, independent of the velocity
and acceleration of the particle, while $p_\mu\,dp^\mu/d\tau = (1/2)\,d(p_\mu p^\mu)/d\tau = 0$.
Therefore, the only quantity that can be formed, which is quadratic in
the velocities and accelerations, is $(dp_\mu/d\tau)(dp^\mu/d\tau)$. Writing $dp^0/d\tau$
and $dp^i/d\tau$ in terms of $dv/d\tau$ and $dv^i/d\tau$, where as usual $v = |\mathbf{v}|$, we
get²⁵

\[
\frac{1}{m^2}\frac{dp_\mu}{d\tau}\frac{dp^\mu}{d\tau} = \gamma^6\left[\frac{1}{\gamma^2}\left(\frac{dv^i(t)}{dt}\right)^2 + \frac{v^2}{c^2}\left(\frac{dv(t)}{dt}\right)^2\right].
\tag{10.166}
\]

Comparing with eq. (10.160), we see that

\[
\frac{d\mathcal{E}}{dt_{\rm ret}} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3m^2c^3}\left(\frac{dp_\mu}{d\tau}\frac{dp^\mu}{d\tau}\right).
\tag{10.167}
\]


Equation (10.167) therefore provides an expression for the power radiated
by a point charge in an arbitrary relativistic motion, equivalent to
eq. (10.156), but written in an explicitly Lorentz-invariant form. Using
$d\tau = dt_{\rm ret}/\gamma$, and recalling that $u^0 = \gamma c$, we can rewrite eq. (10.167) as

\[
\frac{d\mathcal{E}}{d\tau} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3m^2c^4}\left(\frac{dp_\mu}{d\tau}\frac{dp^\mu}{d\tau}\right)u^0.
\tag{10.168}
\]

We then recognize that this is the $\mu = 0$ component of a covariant
equation,

\[
\frac{dP^\mu_{\rm em}}{d\tau} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3m^2c^5}\left(\frac{dp_\nu}{d\tau}\frac{dp^\nu}{d\tau}\right)u^\mu,
\tag{10.169}
\]

where $P^\mu_{\rm em} = (\mathcal{E}/c, \mathbf{P}_{\rm em})$ is the four-vector describing the electromagnetic
energy and momentum, radiated by a charge with four-momentum
$p^\mu$ and four-velocity $u^\mu = p^\mu/m$. The spatial components of this equation
give the radiated momentum.²⁶
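Equation (10.166) can be checked independently of the algebra in the margin note by differentiating the four-momentum numerically along a sample trajectory. The sketch below does this with central differences, using the signature $(-,+,+,+)$ implied by $p_\mu p^\mu = -m^2c^2$. The velocity history, the units $c = m = 1$, and all function names are hypothetical choices made for illustration.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for convenience)
M = 1.0  # mass in arbitrary units

def velocity(t):
    """Hypothetical smooth velocity history with |v| < c around t = 0.3."""
    return (0.4 + 0.3 * t, 0.2 * math.sin(t), 0.5 - 0.1 * t * t)

def acceleration(t):
    """Analytic time derivative of velocity(t)."""
    return (0.3, 0.2 * math.cos(t), -0.2 * t)

def gamma_of(v):
    return 1.0 / math.sqrt(1.0 - sum(x * x for x in v) / C ** 2)

def four_momentum(t):
    v = velocity(t)
    g = gamma_of(v)
    return (g * M * C, g * M * v[0], g * M * v[1], g * M * v[2])

def invariant_numeric(t, h=1e-5):
    """(dp_mu/dtau)(dp^mu/dtau) by central differences, signature (-,+,+,+)."""
    g = gamma_of(velocity(t))
    p_plus, p_minus = four_momentum(t + h), four_momentum(t - h)
    # d/dtau = gamma d/dt along the worldline
    dp = [g * (a - b) / (2.0 * h) for a, b in zip(p_plus, p_minus)]
    return -dp[0] ** 2 + dp[1] ** 2 + dp[2] ** 2 + dp[3] ** 2

def invariant_from_10_166(t):
    """m^2 gamma^6 [a^2/gamma^2 + (v.a)^2/c^2], using v dv/dt = v.a."""
    v, a = velocity(t), acceleration(t)
    g = gamma_of(v)
    v_dot_a = sum(x * y for x, y in zip(v, a))
    a_sq = sum(x * x for x in a)
    return M ** 2 * g ** 6 * (a_sq / g ** 2 + v_dot_a ** 2 / C ** 2)

t0 = 0.3
print(invariant_numeric(t0), invariant_from_10_166(t0))
```

The two numbers should agree up to the finite-difference error, which is one way to see that the invariant $(dp_\mu/d\tau)(dp^\mu/d\tau)$ carries exactly the velocity and acceleration dependence of eq. (10.160).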


24 The uniqueness of this covariantization procedure is ensured by the fact
that, given the value of a quantity in a frame, in this case the power in the rest
frame, and its transformation properties under Lorentz transformations, the
value in any other Lorentz frame is uniquely determined.


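25 Explicitly, $d\gamma/d\tau = (\gamma^3/c^2)\,v\,dv/d\tau$, so

\[ \frac{dp^0}{d\tau} = m\frac{\gamma^3}{c}\,v\frac{dv}{d\tau}, \tag{10.164} \]

while

\[ \frac{dp^i}{d\tau} = m\gamma\frac{dv^i}{d\tau} + m\frac{\gamma^3}{c^2}\,v\frac{dv}{d\tau}\,v^i. \tag{10.165} \]

Therefore

\[
\begin{aligned}
\frac{1}{m^2}\frac{dp_\mu}{d\tau}\frac{dp^\mu}{d\tau}
&= -\frac{\gamma^6}{c^2}\,v^2\left(\frac{dv}{d\tau}\right)^2 + \left(\gamma\frac{dv^i}{d\tau} + \frac{\gamma^3}{c^2}\,v\frac{dv}{d\tau}\,v^i\right)^2 \\
&= \gamma^2\left(\frac{dv^i}{d\tau}\right)^2 + \frac{\gamma^4}{c^2}\,v^2\left(\frac{dv}{d\tau}\right)^2,
\end{aligned}
\]

where, in the second line, we used $v^i\,dv^i/d\tau = v\,dv/d\tau$, see eq. (10.159).
Using $d\tau = dt_{\rm ret}/\gamma$, see the discussion following eq. (10.150), we get
eq. (10.166).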


26 It should be stressed that these results have been obtained assuming
that the charge that radiates is exactly point-like, since we have used
the Liénard–Wiechert potentials, which make use of eqs. (8.1) and (8.2). So,
while eqs. (10.167) and (10.169) might look like “exact” results valid for
arbitrary relativistic motion, one should bear in mind that they are exact only in
this, highly idealized, approximation. As we will see in Section 11.2, for an
extended charge distribution the radiation emitted has a further dependence
on all its higher-order charge and current multipoles, which give contributions
suppressed by higher powers of $v/c$. So, while in the non-relativistic
limit the leading term is indeed given by the non-relativistic Larmor formula
(10.148), the exact result at all orders in $v/c$ is given by eq. (10.167) only
in the approximation when one models the charge distribution as point-like.
27 Note that, in contrast, we always keep $v > 0$, otherwise we have to
change correspondingly the definition of $\theta$, if we want $\theta$ to remain the
angle between acceleration and velocity. When $v = 0$, we define $\theta$ from
$\hat{\mathbf{R}}\cdot\hat{\mathbf{a}} = \cos\theta$, so the case $v = 0$ is obtained as
the limit $v \to 0$ with $a > 0$.


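28 Explicitly,

\[
\begin{aligned}
\left[\dot{\mathbf{v}}\times(\hat{\mathbf{R}} - \mathbf{v}/c)\right]\times\hat{\mathbf{R}}
&= a\left[\hat{\mathbf{v}}\times(\hat{\mathbf{R}} - \beta\hat{\mathbf{v}})\right]\times\hat{\mathbf{R}} \\
&= a\,(\hat{\mathbf{v}}\times\hat{\mathbf{R}})\times\hat{\mathbf{R}} \\
&= a\left[(\hat{\mathbf{v}}\cdot\hat{\mathbf{R}})\hat{\mathbf{R}} - (\hat{\mathbf{R}}\cdot\hat{\mathbf{R}})\hat{\mathbf{v}}\right] \\
&= a\,(\cos\theta\,\hat{\mathbf{R}} - \hat{\mathbf{v}}),
\end{aligned}
\]

where $\beta = v/c$, and we used $\hat{\mathbf{v}}\times\hat{\mathbf{v}} = 0$
and eq. (1.10).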


10.6.2 Acceleration parallel to the velocity 


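We now go back to eq. (10.154), to discuss in more detail the angular
dependence in some simple cases. In this subsection we consider the
situation in which the acceleration is parallel (or antiparallel) to the
velocity. We then write

\[ \mathbf{v} = v\,\hat{\mathbf{v}}, \qquad \dot{\mathbf{v}} = a\,\hat{\mathbf{v}}, \tag{10.170} \]

and we define $\theta$ from $\hat{\mathbf{R}}\cdot\hat{\mathbf{v}} = \cos\theta$, so $\theta$ is the angle between the velocity
and the direction of observation. The case where $\mathbf{v}$ and $\mathbf{a}$ are antiparallel
can be included choosing $v > 0$ and $a < 0$.²⁷ Then, in the numerator of
eq. (10.154),²⁸

\[ \left[\dot{\mathbf{v}}\times(\hat{\mathbf{R}} - \mathbf{v}/c)\right]\times\hat{\mathbf{R}} = a\,(\cos\theta\,\hat{\mathbf{R}} - \hat{\mathbf{v}}). \tag{10.171} \]

Therefore

\[
\left|\left[\dot{\mathbf{v}}\times(\hat{\mathbf{R}} - \mathbf{v}/c)\right]\times\hat{\mathbf{R}}\right|^2 = a^2(\cos^2\theta + 1 - 2\cos^2\theta) = a^2\sin^2\theta.
\tag{10.172}
\]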


The angular distribution of the radiated power is then given by

\[
\frac{dP_{e,\rm parallel}(t_{\rm ret})}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2}{4\pi c^3}\,\frac{\sin^2\theta}{(1-\beta\cos\theta)^5},
\tag{10.173}
\]

where $P_e(t_{\rm ret}) = d\mathcal{E}/dt_{\rm ret}$ is the “emitted” power, see eq. (10.152), and
the subscript “parallel” stresses that this result is valid when $\mathbf{a}$ is parallel
(or antiparallel) to $\mathbf{v}$. This distribution is very peculiar for $\beta \to 1$.
Indeed, in the forward direction, i.e., at $\theta = 0$, the radiation emitted
vanishes exactly because of the $\sin^2\theta$ in the numerator. However, for $\beta$
close to one, the denominator strongly enhances the radiation at small
angles. Therefore, the distribution has a peak at a value of $\theta$ non-vanishing
but very small and (taking into account that the distribution
is invariant under rotations around the direction of the acceleration) the
radiation is focused into a narrow cone close to the forward direction.
Fig. 10.1 shows the function

\[ f_\parallel(\theta) = \frac{\sin^2\theta}{(1-\beta\cos\theta)^5}, \tag{10.174} \]

for $\beta = 0$, $\beta = 0.6$ and $\beta = 0.9$. The maximum of the right-hand side of
eq. (10.173), as a function of $\theta$, is given by

\[ \cos\theta_{\rm max} = \frac{-1 + \sqrt{1 + 15\beta^2}}{3\beta}. \tag{10.175} \]

Inverting the relation $\gamma^2 = 1/(1-\beta^2)$ we have $\beta^2 = 1 - 1/\gamma^2$, so, for
large $\gamma$,

\[ \beta = 1 - \frac{1}{2\gamma^2} + O\left(\frac{1}{\gamma^4}\right). \tag{10.176} \]
Inserting this into eq. (10.175) and expanding the numerator and the
denominator to first order in $1/\gamma^2$, we get

\[ \cos\theta_{\rm max} \simeq 1 - \frac{1}{8\gamma^2} \qquad (\gamma \gg 1). \tag{10.177} \]

This confirms that, for large $\gamma$, the peak of the distribution is at an angle
$\theta_{\rm max}$ such that $\cos\theta_{\rm max}$ is very close to one, i.e., $\theta_{\rm max}$ is very close to
zero. Writing $\cos\theta_{\rm max} \simeq 1 - (1/2)\theta_{\rm max}^2$, we get

\[ \theta_{\rm max} = \frac{1}{2\gamma} + O\left(\frac{1}{\gamma^3}\right). \tag{10.178} \]

The angular width of the peak is also of order $1/\gamma$. Indeed, in the limit
$\theta \ll 1$ and $\gamma \gg 1$ we can write

\[
\begin{aligned}
\frac{\sin^2\theta}{(1-\beta\cos\theta)^5} &\simeq \frac{\theta^2}{(1-\beta+\beta\theta^2/2)^5} \\
&\simeq 32\gamma^8\,\frac{\theta^2\gamma^2}{(1+\theta^2\gamma^2)^5},
\end{aligned}
\tag{10.179}
\]

where, in the second line, we used eq. (10.176) (note that we made no
assumption on the product $\theta\gamma$). Therefore

\[
\frac{dP_{e,\rm parallel}(t_{\rm ret})}{d\Omega} \simeq \frac{1}{4\pi\epsilon_0}\,\frac{8q^2a^2}{\pi c^3}\,\gamma^8\,\frac{\theta^2\gamma^2}{(1+\theta^2\gamma^2)^5}
\qquad (\gamma \gg 1,\ \theta \ll 1).
\tag{10.180}
\]

From this expression, we can verify again that the maximum is at
$\theta\gamma = 1/2$, and we see that the width $\Delta\theta$ of the distribution is of order
$1/\gamma$. Therefore, when the acceleration is parallel to the velocity,
and the particle is highly relativistic, the radiation is focused into a
very narrow cone in the direction of motion, peaked at an opening angle
$\theta_{\rm max} \simeq 1/(2\gamma)$, and with a width of order $1/\gamma$.
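The approach of the exact peak position (10.175) to the asymptotic value $1/(2\gamma)$ of eq. (10.178) can be illustrated with a few lines of Python. The function names and the sample values of $\gamma$ below are assumptions made only for this sketch.

```python
import math

def cos_theta_max(beta):
    """Eq. (10.175): location of the maximum of f_parallel, for 0 < beta < 1."""
    return (-1.0 + math.sqrt(1.0 + 15.0 * beta ** 2)) / (3.0 * beta)

def f_parallel(theta, beta):
    """Eq. (10.174): angular factor for acceleration parallel to velocity."""
    return math.sin(theta) ** 2 / (1.0 - beta * math.cos(theta)) ** 5

def peak_angle(gamma):
    """Exact peak angle for a given Lorentz factor gamma > 1."""
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    return math.acos(cos_theta_max(beta))

for gamma in (2.0, 10.0, 50.0):
    theta = peak_angle(gamma)
    # Columns: gamma, exact peak angle, asymptotic estimate 1/(2 gamma)
    print(gamma, theta, 1.0 / (2.0 * gamma))
```

For moderate $\gamma$ the two columns differ visibly, while for large $\gamma$ they converge, as expected from the $O(1/\gamma^3)$ correction in eq. (10.178).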

Observe that the case of acceleration antiparallel to the velocity can
be simply obtained replacing $a \to -a$ in eq. (10.170). However, since
eq. (10.173) is unchanged under $a \to -a$, the result is the same when
the particle is accelerated or decelerated in the direction of its velocity.
The latter situation typically takes place when a relativistic electron
hits a target, which rapidly decelerates it. The corresponding radiation,
which classically is described by eq. (10.173), is called bremsstrahlung, or
“braking radiation.” More generally, bremsstrahlung takes place because
of the acceleration of a charge in the Coulomb field of another charge
and is also called free-free emission.

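The integration of eq. (10.173) over the solid angle is given by an
elementary integral,

\[
\begin{aligned}
\int d\Omega\,\frac{\sin^2\theta}{(1-\beta\cos\theta)^5}
&= 2\pi\int_{-1}^{1} d\cos\theta\,\frac{1-\cos^2\theta}{(1-\beta\cos\theta)^5} \\
&= \frac{8\pi}{3}\left(\frac{1}{1-\beta^2}\right)^3 \\
&= \frac{8\pi}{3}\,\gamma^6.
\end{aligned}
\tag{10.181}
\]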
Fig. 10.1 A polar plot of the function $f_\parallel(\theta)$ for $\beta = 0$ (upper panel),
$\beta = 0.6$ (middle panel), and $\beta = 0.9$ (lower panel). The direction of $\mathbf{a}$
(and, when non-zero, of $\mathbf{v}$) is shown by the arrows. All plots are unchanged
if we invert the direction of the acceleration. The full distributions in
three-dimensional space are rotationally symmetric around the horizontal axis,
corresponding to the fact that the distributions are independent of the
azimuthal angle $\phi$. Note the difference in scales between the three panels.
Therefore, the emitted power radiated instantaneously when the acceleration
is parallel (or antiparallel) to the velocity is

\[ P_{e,\rm parallel} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2a^2}{3c^3}\,\gamma^6. \tag{10.182} \]

For $\gamma \to 1$ this reduces to Larmor’s formula (10.148), as it should.
We could have also obtained this result from eq. (10.160), using the fact
that, when the acceleration is parallel to the velocity, $dv/dt = d|\mathbf{v}|/dt$
becomes the same as $a = |d\mathbf{v}/dt|$.
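The elementary integral (10.181), and hence the factor $\gamma^6$ in eq. (10.182), can be verified numerically. The sketch below uses a composite Simpson rule in the variable $x = \cos\theta$; the function names, the number of subintervals, and the sample values of $\beta$ are illustrative assumptions.

```python
import math

def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def angular_integral_parallel(beta):
    """Solid-angle integral of sin^2(theta) / (1 - beta cos(theta))^5."""
    def integrand(x):  # x = cos(theta); the phi integral gives 2*pi
        return (1.0 - x * x) / (1.0 - beta * x) ** 5
    return 2.0 * math.pi * simpson(integrand, -1.0, 1.0)

for beta in (0.0, 0.6, 0.9):
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    # Columns: beta, numerical integral, closed form (8 pi / 3) gamma^6
    print(beta, angular_integral_parallel(beta), 8.0 * math.pi / 3.0 * gamma ** 6)
```

At $\beta = 0$ the integral reduces to the value $8\pi/3$ used in the derivation of the non-relativistic Larmor formula.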

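Instead of expressing the power in terms of the acceleration, it can
be useful to express it in terms of $|d\mathbf{p}/dt|$, i.e., of the force applied to
the particle in order to accelerate it (equivalently, we could use
$d\mathbf{p}/d\tau = \gamma\,d\mathbf{p}/dt$). Proceeding as in eq. (10.164) and using the fact that, when the
acceleration is parallel to the velocity, the modulus of the acceleration
is related to the modulus of the velocity by $a = dv/dt$, we have²⁹

\[ \frac{d\mathbf{p}}{dt} = m\gamma^3\mathbf{a}, \tag{10.183} \]

and therefore

\[ \left|\frac{d\mathbf{p}}{dt}\right| = \gamma^3ma, \qquad (\mathbf{a}\parallel\mathbf{v}). \tag{10.184} \]

Using this to eliminate $a$ in favor of $|d\mathbf{p}/dt|$ from eq. (10.182), we get

\[ P_{e,\rm parallel} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3m^2c^3}\left|\frac{d\mathbf{p}}{dt}\right|^2. \tag{10.185} \]

29 Explicitly,

\[
\frac{1}{m}\frac{d\mathbf{p}}{dt} = \frac{d(\gamma\mathbf{v})}{dt} = \gamma\mathbf{a} + \frac{d\gamma}{dt}\,\mathbf{v}
= \gamma\mathbf{a} + \gamma^3\frac{v^2}{c^2}\,\mathbf{a} = \gamma^3\mathbf{a}.
\]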

10.6.3 Acceleration perpendicular to the velocity 


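We next consider the case in which the acceleration is perpendicular
to the velocity, as for a particle accelerated in a circular ring, which
is a common situation in particle accelerators. In this case we set the
instantaneous velocity $\mathbf{v}$ along the $z$ axis, and the acceleration $\mathbf{a}$ along
the $x$ axis, so

\[ \mathbf{v} = v\,\hat{\mathbf{z}}, \qquad \mathbf{a} = a\,\hat{\mathbf{x}}. \tag{10.186} \]

We define the polar angles $\theta, \phi$ with respect to the $z$ axis, so the generic
direction of observation is given by

\[ \hat{\mathbf{R}} = \sin\theta\cos\phi\,\hat{\mathbf{x}} + \sin\theta\sin\phi\,\hat{\mathbf{y}} + \cos\theta\,\hat{\mathbf{z}}. \tag{10.187} \]

Note that we still have

\[ \hat{\mathbf{R}}\cdot\hat{\mathbf{v}} = \cos\theta, \tag{10.188} \]

as in the setting of the previous subsection. However, now the numerator
of eq. (10.154) depends also on the $\phi$ angle: carrying out the triple vector
product,

\[
\begin{aligned}
\left[\hat{\mathbf{x}}\times(\hat{\mathbf{R}} - \beta\hat{\mathbf{z}})\right]\times\hat{\mathbf{R}}
&= \left[\sin^2\theta\cos^2\phi - (1-\beta\cos\theta)\right]\hat{\mathbf{x}} \\
&\quad + \sin^2\theta\sin\phi\cos\phi\,\hat{\mathbf{y}} + \sin\theta\cos\phi\,(\cos\theta - \beta)\,\hat{\mathbf{z}},
\end{aligned}
\tag{10.189}
\]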
and, taking the modulus squared and combining the various terms,

\[
\left|\left[\hat{\mathbf{x}}\times(\hat{\mathbf{R}} - \beta\hat{\mathbf{z}})\right]\times\hat{\mathbf{R}}\right|^2 = (1-\beta\cos\theta)^2 - (1-\beta^2)\sin^2\theta\cos^2\phi.
\tag{10.190}
\]

Inserting this into eq. (10.154) and writing $1-\beta^2 = 1/\gamma^2$, we get

\[
\frac{dP_{e,\rm circ}(t_{\rm ret})}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2}{4\pi c^3}\,\frac{1}{(1-\beta\cos\theta)^3}
\left[1 - \frac{\sin^2\theta\cos^2\phi}{\gamma^2(1-\beta\cos\theta)^2}\right],
\tag{10.191}
\]

where we added the subscript “circ” to the emitted power $P_e$ to stress
that this is the result when, instantaneously, $\mathbf{a}\perp\mathbf{v}$, so, in particular,
for a circular motion.

In Fig. 10.2 we show, for $\beta = 0$, $0.6$, and $0.9$, the function

\[
f_\perp(\theta) = \frac{1}{(1-\beta\cos\theta)^3}\left[1 - \frac{(1-\beta^2)\sin^2\theta}{(1-\beta\cos\theta)^2}\right],
\tag{10.192}
\]

that, according to eq. (10.191), determines the distribution $dP_{e,\rm circ}/d\Omega$ in
$\theta$, in the plane $\phi = 0$.
Integrating eq. (10.191) over the solid angle, we get

\[ P_{e,\rm circ} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2a^2}{3c^3}\,\gamma^4. \tag{10.193} \]

Comparing with eq. (10.182) we see that, for fixed acceleration $a$, the
power radiated when the acceleration is parallel to the velocity is larger
than that radiated in a circular motion, by a factor $\gamma^2$. However, it
is usually more significant to compare the power radiated for a fixed
external force. To this purpose, similarly to eq. (10.184), we can rewrite
$P_{e,\rm circ}$ expressing $a$ in terms of $|d\mathbf{p}/dt|$. For circular motion, the modulus $v$
of the velocity does not change (so $\gamma$ does not change) and $d\mathbf{p}/dt = \gamma m\mathbf{a}$,
so

\[ \left|\frac{d\mathbf{p}}{dt}\right| = \gamma ma, \qquad (\mathbf{a}\perp\mathbf{v}). \tag{10.194} \]

Using this to eliminate $a$ in favor of $|d\mathbf{p}/dt|$ from eq. (10.193), we get

\[ P_{e,\rm circ} = \frac{1}{4\pi\epsilon_0}\,\frac{2q^2}{3m^2c^3}\,\gamma^2\left|\frac{d\mathbf{p}}{dt}\right|^2. \tag{10.195} \]

Comparing with eq. (10.185) we see that, at fixed $|d\mathbf{p}/dt|$, the situation
is opposite and now $P_{e,\rm circ}$ is larger than $P_{e,\rm parallel}$ by a factor $\gamma^2$. Therefore,
much more external power is needed to overcome radiation losses
in circular accelerators, compared to linear accelerators.
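The two comparisons made above (fixed acceleration versus fixed force) can be made concrete with a short numerical sketch. It first checks that the solid-angle integral of the angular factor in eq. (10.191) equals $(8\pi/3)\gamma^4$, and then evaluates the emitted powers at fixed $|d\mathbf{p}/dt|$ with the common prefactor $(1/4\pi\epsilon_0)\,2q^2/(3c^3)$ dropped. All names and sample numbers are assumptions for illustration.

```python
import math

def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def angular_integral_circ(beta):
    """Solid-angle integral of the angular factor in eq. (10.191).

    The phi integral is done analytically: it gives 2*pi for the first term
    and pi (the integral of cos^2 phi) for the second one.
    """
    def integrand(x):  # x = cos(theta)
        first = 2.0 * math.pi / (1.0 - beta * x) ** 3
        second = math.pi * (1.0 - beta ** 2) * (1.0 - x * x) / (1.0 - beta * x) ** 5
        return first - second
    return simpson(integrand, -1.0, 1.0)

def emitted_power_parallel(force, gamma, m=1.0):
    a = force / (gamma ** 3 * m)   # eq. (10.184)
    return gamma ** 6 * a ** 2     # eq. (10.182), common prefactor dropped

def emitted_power_circ(force, gamma, m=1.0):
    a = force / (gamma * m)        # eq. (10.194)
    return gamma ** 4 * a ** 2     # eq. (10.193), common prefactor dropped

beta = 0.9
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
print(angular_integral_circ(beta), 8.0 * math.pi / 3.0 * gamma ** 4)
print(emitted_power_circ(2.0, 5.0) / emitted_power_parallel(2.0, 5.0))  # gamma^2 at fixed force
```

The last line shows the ratio $\gamma^2$ at fixed force, which is the origin of the statement that radiation losses are much more severe in circular than in linear accelerators.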

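In the non-relativistic limit $\beta \to 0$, eq. (10.191) becomes

\[
\frac{dP_{e,\rm circ}(t_{\rm ret})}{d\Omega} \simeq \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2}{4\pi c^3}\left(1 - \sin^2\theta\cos^2\phi\right).
\tag{10.196}
\]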
Fig. 10.2 A polar plot of the function $f_\perp(\theta)$ for $\beta = 0$ (upper panel,
the same as the upper panel in Fig. 10.1, with the acceleration now
set on the vertical axis), $\beta = 0.6$ (middle panel), and $\beta = 0.9$ (lower
panel). The direction of $\mathbf{a}$ (and, when non-zero, of $\mathbf{v}$) is shown by the
arrows. All plots are unchanged if we invert the direction of the acceleration.
Note the difference in scales between the three panels.
30 The cyclotron is a particle accelerator, invented by E. Lawrence in 1929–1930,
where the charged particles are kept in an outward-bound spiral orbit
by a magnetic field, and are accelerated by a time-varying electric field.
The fast temporal variation of the electric field is synchronized with the
outward spiral motion of the particle, so that the particles undergo several
cycles of acceleration. This is possible only as long as the particle is
non-relativistic, and the frequency for the motion in a magnetic field, which in
general is given by $\omega = qB/(m\gamma)$ [see eq. (8.201)], reduces to $\omega = qB/m$
and becomes independent of the velocity (and is indeed called the cyclotron
frequency). To keep accelerating the particles when they become relativistic,
the most successful solution turned out to be to keep the particle in a circular
orbit of fixed radius by increasing the magnetic field during the acceleration
phase; this led to the synchrotron. Cyclotrons were the most powerful particle
accelerators until the 1950s, when they were superseded by synchrotrons
(and other variants of the idea, such as synchrocyclotrons), but are still used
in medical applications. Currently, the largest accelerator in the world is
the Large Hadron Collider (LHC) at CERN, which is a synchrotron-type
accelerator.


Fig. 10.3 A zoom of the middle panel of Fig. 10.2, with $\beta = 0.6$.


The radiation emitted in the non-relativistic limit by a charged particle
in a circular or quasi-circular orbit (which, in practice, is obtained when
the particle is moving in the plane perpendicular to an external magnetic
field) is called cyclotron radiation. The limit $\beta \to 1$ (i.e., $\gamma \to \infty$) of
eq. (10.191) defines instead synchrotron radiation.³⁰ Once again, as $\beta \to 1$,
the radiation is focused into a narrow forward cone, because of the
focusing effect of the denominator. Note, however, that now eq. (10.191)
is non-vanishing for $\theta = 0$. Just as we have done for eq. (10.180), we
can expand eq. (10.191) for $\gamma \to \infty$ and $\theta \to 0$, without the need of
assuming anything for the product $\theta\gamma$, writing

\[
1 - \beta\cos\theta \simeq 1 - \beta\left(1 - \frac{\theta^2}{2}\right) \simeq \frac{1}{2\gamma^2}\left(1 + \theta^2\gamma^2\right),
\tag{10.197}
\]

where we used eq. (10.176). Then, for $\gamma \gg 1$ and $\theta \ll 1$, eq. (10.191)
becomes

\[
\frac{dP_{e,\rm circ}(t_{\rm ret})}{d\Omega} \simeq \frac{1}{4\pi\epsilon_0}\,\frac{2q^2a^2}{\pi c^3}\,\frac{\gamma^6}{(1+\theta^2\gamma^2)^3}
\left[1 - \frac{4\theta^2\gamma^2\cos^2\phi}{(1+\theta^2\gamma^2)^2}\right],
\tag{10.198}
\]

[to be compared with eq. (10.180)], so now the maximum is at $\theta = 0$,
while the width of the distribution is again $\Delta\theta \sim 1/\gamma$, as is also seen
from Fig. 10.2. Observe that, for $\theta\gamma > 0$, the term in brackets vanishes
when

\[ 1 + (\theta\gamma)^2 = 2(\theta\gamma)\,|\cos\phi|. \tag{10.199} \]

Solving the second-degree equation (10.199) with respect to $\theta\gamma$ gives

\[ \theta\gamma = |\cos\phi| \pm \sqrt{\cos^2\phi - 1}. \tag{10.200} \]

This has real solutions only if $\cos^2\phi = 1$, and then $\theta\gamma = 1$, so the
distribution vanishes for $\theta = 1/\gamma$ and $\phi = 0$ or $\phi = \pi$.

In Fig. 10.3 we show a zoom of the middle panel of Fig. 10.2,
where $\beta = 0.6$ and $\cos\phi = 1$, which shows that the distribution indeed
vanishes at a critical value of $\theta$, and also has a smaller lobe, mostly in
the backward direction, but partially tilted forward.
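The zero of the distribution can also be located without the small-angle expansion. Setting $\phi = 0$ in eq. (10.190), the right-hand side becomes $(1-\beta\cos\theta)^2 - (1-\beta^2)\sin^2\theta = (\cos\theta - \beta)^2$, so the exact zero sits at $\cos\theta = \beta$, which for $\gamma \gg 1$ reduces to $\theta \simeq 1/\gamma$. The following sketch checks this on the function $f_\perp(\theta)$ of eq. (10.192); the function names and the sample values of $\beta$ are assumptions for illustration.

```python
import math

def f_perp(theta, beta):
    """Eq. (10.192): angular factor of eq. (10.191) in the plane phi = 0."""
    denom = 1.0 - beta * math.cos(theta)
    return (1.0 - (1.0 - beta ** 2) * math.sin(theta) ** 2 / denom ** 2) / denom ** 3

def zero_angle(beta):
    """Angle at which f_perp vanishes, from cos(theta) = beta."""
    return math.acos(beta)

for beta in (0.6, 0.9, 0.999):
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    theta0 = zero_angle(beta)
    # Columns: beta, zero angle in degrees, theta0 * gamma, f_perp at the zero
    print(beta, math.degrees(theta0), theta0 * gamma, f_perp(theta0, beta))
```

For $\beta = 0.6$ the zero is at about 53 degrees, consistent with the critical angle visible in the zoom of Fig. 10.3, while the product $\theta_0\gamma$ approaches one as $\beta \to 1$.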


10.7 Solved problems 


Problem 10.1. Dipole oscillating along the z axis 


As a first simple application, we compute the radiation emitted by an electric
dipole with charge $q$, oscillating with amplitude $a$ along the $z$ axis,

\[ \mathbf{d}(t) = qa\,\hat{\mathbf{z}}\cos\omega_st. \tag{10.201} \]

We assume $a\omega_s \ll c$, so the non-relativistic formulas of Section 10.5 apply.
We compute the radiation in the direction $\hat{\mathbf{n}}$, given in terms of $\theta$ and $\phi$ by

\[ \hat{\mathbf{n}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta). \tag{10.202} \]

Then, using eq. (10.136),

\[
\begin{aligned}
\mathbf{E}(t,\mathbf{x}) &= \frac{qa\omega_s^2}{4\pi\epsilon_0c^2}\,\frac{1}{r}\,\cos[\omega_s(t-r/c)] \\
&\quad\times\left[-\hat{\mathbf{x}}\sin\theta\cos\theta\cos\phi - \hat{\mathbf{y}}\sin\theta\cos\theta\sin\phi + \hat{\mathbf{z}}\sin^2\theta\right].
\end{aligned}
\tag{10.203}
\]

First of all observe that, in the dipole approximation, $\mathbf{E}(t,\mathbf{x})$ oscillates in time
at a frequency $\omega$ equal to the frequency $\omega_s$ of the source. We also observe that
the electric field vanishes on the $z$ axis, i.e., for $\theta = 0$. The same holds for
the magnetic field, since $c\mathbf{B} = \hat{\mathbf{n}}\times\mathbf{E}$. Clearly, there is no radiation in this
direction since there is no acceleration of the source transverse to the $z$ axis.
In the $(x,y)$ plane we have $\theta = \pi/2$, and we see from eq. (10.203) that $\mathbf{E}$ is
linearly polarized along the $z$ axis. For the angular distribution of the power
radiated, eq. (10.144) gives

\[
\frac{dP}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2\omega^4}{4\pi c^3}\,\cos^2[\omega(t-r/c)]\,\sin^2\theta,
\tag{10.204}
\]

where we used the fact that the frequency of the source, $\omega_s$, is the same as the
frequency $\omega$ at which the electric field oscillates. If we perform a time average
over one period, using

\[ \frac{1}{2\pi}\int_0^{2\pi} d\alpha\,\cos^2\alpha = \frac{1}{2}, \tag{10.205} \]

the factor $\cos^2[\omega(t-r/c)]$ is replaced by $1/2$, and

\[
\left\langle\frac{dP}{d\Omega}\right\rangle = \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2\omega^4}{8\pi c^3}\,\sin^2\theta.
\tag{10.206}
\]

Performing the angular integral, we get

\[ \langle P\rangle = \frac{1}{4\pi\epsilon_0}\,\frac{q^2a^2\omega^4}{3c^3}. \tag{10.207} \]
Observe that, in the dipole radiation from a periodic source with frequency
$\omega_s$, the radiated power grows as $\omega_s^4$ (which, for dipole radiation, is the same as
$\omega^4$, with $\omega$ the frequency of the radiation). This is due to the fact that each
time derivative brings a factor of $\omega_s$, so $\ddot{\mathbf{d}} \propto \omega_s^2$ and $P \propto |\ddot{\mathbf{d}}|^2 \propto \omega_s^4$.
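To attach numbers to eq. (10.207), and to check the time average explicitly, the sketch below averages the instantaneous Larmor power (10.148) for the dipole (10.201) over one period and compares it with the closed form. The numerical values of $q$, $a$ and $\omega_s$ are hypothetical, chosen only so that $a\omega_s/c \ll 1$; the function names are likewise assumptions.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m (SI)
C = 299792458.0           # speed of light in m/s

def instantaneous_power(t, q, a, omega_s):
    """Larmor's formula (10.148) applied to the dipole of eq. (10.201)."""
    d_ddot = -q * a * omega_s ** 2 * math.cos(omega_s * t)
    return (1.0 / (4.0 * math.pi * EPS0)) * 2.0 * d_ddot ** 2 / (3.0 * C ** 3)

def averaged_power_numeric(q, a, omega_s, n=1000):
    """Average of the instantaneous power over one period (midpoint samples)."""
    period = 2.0 * math.pi / omega_s
    samples = (instantaneous_power((k + 0.5) * period / n, q, a, omega_s)
               for k in range(n))
    return sum(samples) / n

def averaged_power_formula(q, a, omega_s):
    """Eq. (10.207)."""
    return (1.0 / (4.0 * math.pi * EPS0)) * q ** 2 * a ** 2 * omega_s ** 4 / (3.0 * C ** 3)

# Illustrative numbers (assumptions): one elementary charge oscillating with
# amplitude 1e-10 m at 1e15 rad/s, so that a * omega_s / c is about 3e-4.
q, a, omega_s = 1.602176634e-19, 1.0e-10, 1.0e15
print(averaged_power_numeric(q, a, omega_s), averaged_power_formula(q, a, omega_s))
```

Doubling `omega_s` in this example multiplies the averaged power by 16, which is the $\omega_s^4$ scaling noted above.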


Problem 10.2. Radiation emitted by a non-relativistic charge in 
circular orbit 


As the next application, we consider a charged particle moving counterclockwise
on a circular orbit of radius $a$ in the $(x,y)$ plane, with frequency $\omega_s$
(for instance, the particle could be kept in a circular orbit by the action of an
external magnetic field). We assume again that the velocity $v = \omega_sa \ll c$, so
we can use the non-relativistic approximation to the particle motion, and we
can use Larmor’s formula, which is valid to lowest order in $v/c$. We write

\[ \mathbf{x}_0(t) = a\,(\cos\omega_st, \sin\omega_st, 0), \tag{10.208} \]

so

\[ \ddot{\mathbf{d}}(t) = -qa\omega_s^2\,(\cos\omega_st, \sin\omega_st, 0). \tag{10.209} \]

The radiative part of the electric field is obtained again from eq. (10.136), and
we compute the radiation emitted in the direction of the unit vector $\hat{\mathbf{n}}$, given
in terms of the polar angles $\theta, \phi$ by eq. (10.202). Then, eq. (10.136) gives

\[
E_x(t,r,\theta,\phi) = \frac{qa\omega_s^2}{(4\pi\epsilon_0)c^2r}\left[\cos\omega_st_{\rm ret} - \sin^2\theta\cos\phi\cos(\omega_st_{\rm ret} - \phi)\right],
\tag{10.210}
\]

\[
E_y(t,r,\theta,\phi) = \frac{qa\omega_s^2}{(4\pi\epsilon_0)c^2r}\left[\sin\omega_st_{\rm ret} - \sin^2\theta\sin\phi\cos(\omega_st_{\rm ret} - \phi)\right],
\tag{10.211}
\]

\[
E_z(t,r,\theta,\phi) = -\frac{qa\omega_s^2}{(4\pi\epsilon_0)c^2r}\,\sin\theta\cos\theta\cos(\omega_st_{\rm ret} - \phi),
\tag{10.212}
\]


where as usual, to leading order in a/r retarded time becomes tret = t — (r/c). 
First of all, we observe that, just as in the case of a dipole oscillating along the 
z axis, the frequency w at which the electric field oscillates is the same as the 
frequency ws at which the source rotates. As we will see in more generality in 
Section 11.2, for dipole radiation the frequency of the electromagnetic waves 
generated by a monochromatic source is indeed always equal to the frequency 
of the source (as we will see, in general this is no longer true for the radiation 
generated by higher-order multipoles). 

Along the positive z axis, where 0 = 0, we get (writing henceforth w instead 
of ws) 

qaw? 
(4reo)er 
Therefore, in this direction, light is circularly polarized, with the electric field 
rotating counterclockwise with respect to the +z direction. This corresponds 
to right circularly polarized light, see eq. (9.84). At 0 = v, the result for E 
is still given by eq. (10.213). However, now the propagation direction is —Z 
and, with respect to this propagation direction, eq. (10.213) is left circularly 
polarized light. If we rather set 6 = 7/2, so that we look at the radiation 
emitted in the (x,y) plane, from eqs. (10.210)—(10.212) we get 


2 


E(t,7,0 = 5,4) = ace sin(wtrer — ¢)(—sin@,cos@,0). (10.214) 


E(t,r,0 = 0) = (cos wtret, sin Wtret, 0). (10.213) 


Therefore, in this case the electromagnetic radiation is linearly polarized along 
the direction of the tangent vector to a circle in the (x,y) plane (oriented 
toward the counterclockwise direction). 
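The two limiting cases above can be checked numerically. The following is a minimal sketch, assuming the radiative field has the standard dipole form $\mathbf{E}\propto \hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\ddot{\mathbf{d}})-\ddot{\mathbf{d}}$ (which is what eqs. (10.210)-(10.212) amount to); the function names are illustrative and the field is expressed in units of $qa\omega^2/(4\pi\epsilon_0 c^2 r)$.

```python
import math

def radiative_field(theta, phi, wt_ret):
    """Radiative E for a charge on a circular orbit, in units of
    q a w^2 / (4 pi eps0 c^2 r).  Uses E ~ u - n (n.u), where
    u = (cos wt, sin wt, 0) is minus the direction of the dipole acceleration."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    u = (math.cos(wt_ret), math.sin(wt_ret), 0.0)
    n_dot_u = sum(ni * ui for ni, ui in zip(n, u))
    return tuple(ui - ni * n_dot_u for ni, ui in zip(n, u))

def equatorial_field(phi, wt_ret):
    """Closed form of eq. (10.214): linear polarization along the
    tangent vector (-sin phi, cos phi, 0)."""
    s = math.sin(wt_ret - phi)
    return (-s * math.sin(phi), s * math.cos(phi), 0.0)

# On the z axis the field simply rotates with the source, eq. (10.213).
on_axis = radiative_field(0.0, 0.0, 1.3)

# In the orbital plane the general expression reduces to eq. (10.214).
in_plane = radiative_field(math.pi / 2, 0.7, 1.3)
in_plane_closed_form = equatorial_field(0.7, 1.3)
```

Comparing `in_plane` with `in_plane_closed_form` for several values of `phi` and `wt_ret` is a quick way to confirm the trigonometric reduction that leads to eq. (10.214).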

To compute the radiated power we first observe, from eq. (10.209), that

$$ |\ddot{\mathbf{d}}(t)|^2 = q^2 a^2 \omega_s^4 \tag{10.215} $$

is actually independent of time. Then, according to eq. (10.144), the radiated
power is also time-independent, and given by

$$ \frac{dP(\theta)}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{q^2 a^2 \omega^4}{4\pi c^3}\,\sin^2\theta\,, \tag{10.216} $$

where we used $\omega_s = \omega$ to write the power in terms of the frequency of the
radiation. The $\sin^2\theta$ dependence of the power is typical of dipole radiation,
as we see from the general result (10.144). Performing the angular integral,

$$ P = \frac{1}{4\pi\epsilon_0}\,\frac{2 q^2 a^2 \omega^4}{3 c^3}\,. \tag{10.217} $$

Note that this is twice as large as the result in eq. (10.207), corresponding to
the fact that a circular motion in the $(x, y)$ plane can be seen as a superposition
of two oscillators, one oscillating along the $x$ axis and one along the $y$ axis.
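The angular integral behind eq. (10.217) can be checked with a short numerical sketch. It assumes the general dipole pattern $dP/d\Omega \propto |\hat{\mathbf{n}}\times(\hat{\mathbf{n}}\times\ddot{\mathbf{d}})|^2 = |\ddot{\mathbf{d}}|^2-(\hat{\mathbf{n}}\cdot\ddot{\mathbf{d}})^2$ and works in units where $|\ddot{\mathbf{d}}|=1$; the function names and grid sizes are illustrative choices.

```python
import math

def angular_pattern(theta, phi, wt):
    """|n x (n x u)|^2 = 1 - (n.u)^2 for the unit vector u = (cos wt, sin wt, 0)."""
    n_dot_u = math.sin(theta) * math.cos(wt - phi)
    return 1.0 - n_dot_u ** 2

def integrate_over_sphere(wt, n_theta=200, n_phi=200):
    """Midpoint-rule integral of the pattern over the full solid angle."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += angular_pattern(theta, phi, wt) * math.sin(theta)
    return total * d_theta * d_phi

# The solid-angle integral should equal 8*pi/3 at every instant, so that
# P = (1/4 pi eps0) |ddot d|^2 / (4 pi c^3) * (8 pi / 3) = (1/4 pi eps0) 2 |ddot d|^2 / (3 c^3).
factor_t0 = integrate_over_sphere(0.0)
factor_t1 = integrate_over_sphere(1.1)
expected = 8.0 * math.pi / 3.0
```

The fact that `factor_t0` and `factor_t1` agree illustrates that the total power is time-independent even though the direction of $\ddot{\mathbf{d}}$ rotates.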


Problem 10.3. Radiative collapse of a classical model of the hydro- 
gen atom 


Historically, the problem of the radiation emitted by a charge in a circular 
orbit was also important to show the inadequacy of a classical model of the 
atom. Consider a classical model of the hydrogen atom, with the electron 
moving in a circular orbit around the proton. The system radiates electro- 
magnetic energy to infinity according to eq. (10.217). This energy must be 
taken from the mechanical energy Emech of the system, which therefore must 
decrease, according to³¹

$$ P = -\frac{dE_{\rm mech}}{dt}\,. \tag{10.218} $$

The mechanical energy is given by

$$ E_{\rm mech} = \frac{1}{2} m v^2 - \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{r}\,, \tag{10.219} $$

where $r$ is the relative distance between the electron and the proton and $m$
the reduced mass. Actually, since $m_p \simeq 2000\, m_e$, to good approximation we
can take the proton at rest at the origin, and $m \simeq m_e$.

For an electron in a circular orbit of radius $r$ we have $v = \omega r$ and the
modulus of the acceleration $\mathbf{a}$ is $|\mathbf{a}| = \omega^2 r$, so $\mathbf{F} = m\mathbf{a}$ gives Kepler's law

$$ \omega^2 = \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{m r^3}\,. \tag{10.220} $$

Then

$$ E_{\rm mech} = \frac{1}{2} m \omega^2 r^2 - \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{r} = -\frac{1}{4\pi\epsilon_0}\,\frac{e^2}{2r}\,, \tag{10.221} $$

which is the result, familiar from elementary classical mechanics, that in the
Coulomb potential the kinetic energy is 1/2 of the absolute value of the
potential energy. Note that the total energy is negative, as it should be for a bound
state. In order to balance the energy radiated, $E_{\rm mech}$ must decrease, i.e., must
become more negative, so, according to eq. (10.221), $r$ gets smaller until,
eventually, the electron collapses on the nucleus.

The radiated power (10.217) has been computed under the assumption that
the electron is kept by an external force on an exactly circular orbit. In
the case of the (classical!) hydrogen atom, we have just seen that $r$ will
actually decrease to compensate for the emitted radiation, so $r = r(t)$. Still,
eq. (10.217) remains a good approximation to the actual radiated power as
long as the induced radial velocity $\dot r$ is much smaller, in absolute value, than
the tangential component of the velocity, i.e., as long as $|\dot r| \ll \omega r$; furthermore,
the use of eq. (10.217) is justified as long as the motion is non-relativistic, so
$\omega r \ll c$. We will check the validity of these conditions a posteriori. Let us
compute the evolution of $r(t)$ in this regime. Combining eq. (10.217) (with
$q = -e$ and orbit radius $a = r$) with eqs. (10.218) and (10.221) we get

$$ \frac{2 e^2 \omega^4 r^2}{3 c^3} = -\frac{d}{dt}\left(-\frac{e^2}{2r}\right). \tag{10.222} $$

Using eq. (10.220) we can rewrite it as

$$ r^2\,\dot r = -\frac{4}{3 m^2 c^3}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2. \tag{10.223} $$
31 In Section 12.3.3 we will provide a more accurate justification of this
conservation equation.
This integrates to

$$ r^3(t) = r^3(0) - \frac{4}{m^2 c^3}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 t\,, \tag{10.224} $$

where $r(0)$ is the value of the radius at $t = 0$, that we take equal to the
Bohr radius of the hydrogen atom, $r_B \simeq 0.53 \times 10^{-8}$ cm. Therefore, in this
approximation, the electron collapses on the nucleus in a time

$$ \tau = \frac{1}{4}\, m^2 c^3 r_B^3 \left(\frac{4\pi\epsilon_0}{e^2}\right)^2. \tag{10.225} $$

We could directly plug in the numbers, and find $\tau \simeq 1.5 \times 10^{-11}$ s. Actually,
it is more instructive to rewrite eq. (10.225) introducing some combinations
that enter at the quantum level, even if our computation is purely classical.
An important combination is the fine structure constant, already introduced
in eq. (5.34),

$$ \alpha = \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{\hbar c}\,. \tag{10.226} $$

This quantity is dimensionless, and, numerically, $\alpha \simeq 1/137$. We borrow from
quantum mechanics the information that the Bohr radius is given by

$$ r_B = \frac{1}{\alpha}\,\frac{\hbar}{m c}\,. \tag{10.227} $$

Therefore, we have the identity

$$ m c^2 r_B \left(\frac{4\pi\epsilon_0}{e^2}\right) = \frac{1}{\alpha^2}\,, \tag{10.228} $$

and eq. (10.225) can be rewritten as

$$ \tau = \frac{1}{4\alpha^4}\,\frac{r_B}{c}\,. \tag{10.229} $$

Note that $r_B/c$ is the time that light takes to cross the size of the atom. It
is an extremely small number, about $0.53 \times 10^{-10}\,{\rm m}/(3 \times 10^{8}\,{\rm m/s}) \simeq 1.8 \times 10^{-19}$ s.
Even if $\tau$ is enhanced, with respect to this, by a factor $1/\alpha^4 \sim (137)^4$,
we still get the very small value $\sim 10^{-11}$ s given previously. This
means that, in classical electromagnetism, atoms are unstable to emission of
electromagnetic radiation, and would collapse in about $10^{-11}$ s! One can
check, from the previous expressions, that the conditions $|\dot r| \ll \omega r$ and
$\omega r \ll c$ both break down only at $r \sim \alpha^2 r_B$, so, when the size of the atom has
become about $5 \times 10^{-5}\, r_B$, which is already of order of the size of the nucleus.
Therefore, the conclusion on the collapse of the atom is not an artifact of these
approximations (furthermore, a relativistic particle radiates away energy even
faster).

This instability to emission of electromagnetic radiation is one of several 
difficulties of a classical model of the atom and shows that, at these scales, 
classical physics is inadequate. As we now know, at these scales the correct 
description is provided by quantum mechanics. 
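The collapse time can be evaluated directly. The sketch below is only a numerical illustration of eqs. (10.225) and (10.229): the SI values of the constants are the standard CODATA ones, the variable names are ours, and the electron mass is used for $m$, as suggested by $m \simeq m_e$ above.

```python
import math

# Physical constants in SI units (CODATA values).
E_CHARGE = 1.602176634e-19      # C
EPS0 = 8.8541878128e-12         # F/m
HBAR = 1.054571817e-34          # J s
M_E = 9.1093837015e-31          # kg
C = 299_792_458.0               # m/s

# e^2 / (4 pi eps0), the combination that appears throughout Problem 10.3.
coulomb = E_CHARGE ** 2 / (4.0 * math.pi * EPS0)

alpha = coulomb / (HBAR * C)            # fine structure constant, eq. (10.226)
r_bohr = HBAR / (alpha * M_E * C)       # Bohr radius, eq. (10.227)

# Collapse time from eq. (10.225) ...
tau_direct = 0.25 * M_E ** 2 * C ** 3 * r_bohr ** 3 / coulomb ** 2
# ... and from the equivalent form, eq. (10.229).
tau_alpha = r_bohr / (4.0 * alpha ** 4 * C)

# Radius at which v ~ c, i.e. where the non-relativistic treatment fails.
r_breakdown = alpha ** 2 * r_bohr
```

Both expressions give the same $\tau$, of order $10^{-11}$ s, and `r_breakdown` comes out at a few times $10^{-15}$ m, i.e. at nuclear scales, in line with the discussion above.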


Radiation from localized 
sources 


In the previous chapter we discussed the radiation generated by a point- 
like charge, with arbitrary motion. We now expand on the previous 
discussion, studying the radiation generated by a localized, but otherwise 
generic, distribution of charges and currents. This will also provide 
the basis for the multipole expansion of the radiation field, that we 
will present in Section 11.2, while in Section 11.3 we will give a first 
simple discussion of relevant scales defining the near and far zone. The 
dynamics of relativistic particles in the near zone will be discussed in 
more detail in Chapter 12, where we will tackle the rather technical 
issue of how relativistic effects, and the back-reaction due to the emitted 
radiation, affect the dynamics of the sources in the near zone. 


11.1 Far zone fields for generic velocities 


11.1.1 Computation in the Lorenz gauge 


We begin by computing the radiation emitted by a generic charge dis- 
tribution. In this subsection we perform the computation in the Lorenz 
gauge (we will compare with the computation in the Coulomb gauge in 
Section 11.1.2). We therefore start from eq. (10.34), 


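$$ A^\mu(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\int d^3x'\; \frac{j^\mu\left(t - |\mathbf{x}-\mathbf{x}'|/c,\ \mathbf{x}'\right)}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{11.1} $$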


where we have set to zero the solution of the homogeneous equation, 
representing incoming radiation, since we want to compute the radiation 
produced by the current $j^\mu$. Note that eq. (11.1) is valid for sources with
arbitrary velocities, since we have used the exact (retarded) Green's
function of the d'Alembertian. As in Chapter 6, we denote by $d$ the
size of the spatial region in which the source is localized. Therefore, in
eq. (11.1), $j^\mu(t_{\rm ret}, \mathbf{x}')$ vanishes for $|\mathbf{x}'| > d$, so the integration variable $\mathbf{x}'$
is effectively limited to $|\mathbf{x}'| < d$. We write, as usual, $|\mathbf{x}| = r$ and $\mathbf{x} = r\hat{\mathbf{n}}$,
and we use the large $r$ expansion (6.3). We are interested only in the
radiation field, that decays as $1/r$. Then, in eq. (11.1) we can replace
$1/|\mathbf{x} - \mathbf{x}'|$ simply by $1/r$, since the subsequent terms of the expansion
will give contributions of order $1/r^2$ and higher. In contrast, in the
argument of $j^\mu$ we must also keep the next term, which does not vanish
1 We will see in Section 11.2 that, for non-relativistic sources, the integral
gets a significant contribution only for values of the integration variable $\omega$ such
that the term $\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c$ is indeed small, and we will be able to expand the
exponential in powers of it. For generic source velocities, however, this term
must be kept in its full form.

2 Explicitly,
$$ \partial_i \hat n_j = \partial_i\left(\frac{x_j}{r}\right) = \frac{\delta_{ij}}{r} - \frac{x_j}{r^2}\,\partial_i r = \frac{1}{r}\left(\delta_{ij} - \hat n_i \hat n_j\right), $$
where we used eq. (6.12).


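for $r \to \infty$. Thus,

$$ A^\mu(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\,\frac{1}{r}\int d^3x'\; j^\mu\left(t - \frac{r}{c} + \frac{\hat{\mathbf{n}}\cdot\mathbf{x}'}{c},\ \mathbf{x}'\right) + O\left(\frac{1}{r^2}\right). \tag{11.2} $$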


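We now perform a Fourier transform with respect to time, writing

$$ \rho(t,\mathbf{x}) = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde\rho(\omega,\mathbf{x})\,, \tag{11.3} $$

$$ \mathbf{j}(t,\mathbf{x}) = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde{\mathbf{j}}(\omega,\mathbf{x})\,. \tag{11.4} $$

Then, restricting to the $1/r$ term, and recalling eqs. (8.9) and (8.12),

$$ \phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\int d^3x'\; \tilde\rho(\omega,\mathbf{x}')\, e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]}\,, \tag{11.5} $$

$$ \mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\int d^3x'\; \tilde{\mathbf{j}}(\omega,\mathbf{x}')\, e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]}\,. \tag{11.6} $$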


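From these expressions it is clear why, in the exponential, we could not
neglect the term $\hat{\mathbf{n}}\cdot\mathbf{x}'$, that comes from eq. (6.3). Even if $r \gg |\hat{\mathbf{n}}\cdot\mathbf{x}'|$,
still the fact that, inside a phase, $\omega r/c$ is much larger than $\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c$ is a
priori irrelevant, since phases are defined only modulo $2\pi$.¹ In contrast,
the term $O(d^2/r)$ in eq. (6.3) goes to zero as $r \to \infty$, and is therefore
negligible even in the phase.

We can now compute the electric field in the far zone, using eq. (3.83).
Let us compute first $\nabla\phi$. Recalling that $\mathbf{x} = r\hat{\mathbf{n}}$, in eq. (11.5) the $\nabla$
operator acts on the factor $1/r$ in front, as well as on the factors $r$ and
$\hat{\mathbf{n}}$ in the exponential. Again, we are only interested in the part of $\mathbf{E}$ that
decreases as $1/r$. The derivative of the overall $1/r$ factor gives a term
proportional to $1/r^2$, and we can neglect it. Similarly,

$$ \partial_i\, e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c} = -\frac{i\omega}{c}\,(\partial_i \hat n_j)\, x'_j\, e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\,, \tag{11.7} $$

and²

$$ \partial_i \hat n_j = O\left(\frac{1}{r}\right). \tag{11.8} $$

Therefore, when combined with the overall $1/r$ factor in front of the
exponential, the term obtained from $\partial_i \hat n_j$ gives again an overall
contribution of order $1/r^2$ as $r \to \infty$. The only contribution to the radiation
field therefore comes from the derivative of the factor $r$ in the exponential.
Notice that this factor is a consequence of the retardation effect. If
it were not for this retardation term, there would be no $1/r$ term in the
electric field, and therefore no radiation at infinity. Using

$$ \partial_i\, e^{i\omega r/c} = (i\omega/c)\,(\partial_i r)\, e^{i\omega r/c} = (i\omega/c)\,\hat n_i\, e^{i\omega r/c}\,, \tag{11.9} $$

where we used again eq. (6.12), we see that, to order $1/r$,

$$ \nabla\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\int d^3x' \int \frac{d\omega}{2\pi}\; \frac{i\omega}{c}\,\hat{\mathbf{n}}\;\tilde\rho(\omega,\mathbf{x}')\, e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]}\,. \tag{11.10} $$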
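The computation of $\partial\mathbf{A}/\partial t$ is straightforward and, keeping again only
the $1/r$ terms, we get

$$ \begin{aligned}
\mathbf{E}(t,\mathbf{x}) &= \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int d^3x' \int \frac{d\omega}{2\pi}\; i\omega\, e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]}\left[\tilde{\mathbf{j}}(\omega,\mathbf{x}') - c\,\hat{\mathbf{n}}\,\tilde\rho(\omega,\mathbf{x}')\right] \\
&= \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; i\omega\, e^{-i\omega[t-(r/c)]}\int d^3x'\; e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\left[\tilde{\mathbf{j}}(\omega,\mathbf{x}') - c\,\hat{\mathbf{n}}\,\tilde\rho(\omega,\mathbf{x}')\right] \\
&= \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]}\; i\omega\left[\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c) - \hat{\mathbf{n}}\,c\,\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c)\right],
\end{aligned} \tag{11.11} $$

where, in the last equality, we expressed the result in terms of the full
Fourier transform with respect to both time and space,

$$ \rho(t,\mathbf{x}) = \int \frac{d\omega}{2\pi}\,\frac{d^3k}{(2\pi)^3}\; e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}\; \tilde\rho(\omega,\mathbf{k})\,, \tag{11.12} $$

$$ \mathbf{j}(t,\mathbf{x}) = \int \frac{d\omega}{2\pi}\,\frac{d^3k}{(2\pi)^3}\; e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}\; \tilde{\mathbf{j}}(\omega,\mathbf{k})\,, \tag{11.13} $$

see eq. (1.105). The inversion of these Fourier transforms gives

$$ \tilde\rho(\omega,\mathbf{k}) = \int dt\, d^3x\; e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}}\; \rho(t,\mathbf{x})\,, \tag{11.14} $$

$$ \tilde{\mathbf{j}}(\omega,\mathbf{k}) = \int dt\, d^3x\; e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}}\; \mathbf{j}(t,\mathbf{x})\,, \tag{11.15} $$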


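see eq. (1.106). We now observe that the continuity equation (3.22),
written in Fourier space (for $\omega$ and $\mathbf{k}$ generic), becomes

$$ \omega\,\tilde\rho(\omega,\mathbf{k}) = \mathbf{k}\cdot\tilde{\mathbf{j}}(\omega,\mathbf{k})\,. \tag{11.16} $$

In eq. (11.11), however, $\mathbf{k}$ and $\omega$ are related by $\mathbf{k} = \omega\hat{\mathbf{n}}/c$, and in this
case the continuity equation gives

$$ c\,\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c) = \hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)\,. \tag{11.17} $$

Then, we get

$$ \mathbf{E}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]}\; i\omega\left\{\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c) - \hat{\mathbf{n}}\left[\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)\right]\right\}. \tag{11.18} $$

According to eq. (10.120), the expression in braces is just the component
of the vector $\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)$ transverse to the momentum $\mathbf{k} = \omega\hat{\mathbf{n}}/c$ or,
equivalently, to the unit vector $\hat{\mathbf{n}}$,

$$ \tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) = \tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c) - \hat{\mathbf{n}}\left[\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)\right]. \tag{11.19} $$

Using eq. (10.122), we can also rewrite it in the equivalent form

$$ \tilde{\mathbf{j}}_\perp\left(\omega,\frac{\omega\hat{\mathbf{n}}}{c}\right) = -\hat{\mathbf{n}}\times\left[\hat{\mathbf{n}}\times\tilde{\mathbf{j}}\left(\omega,\frac{\omega\hat{\mathbf{n}}}{c}\right)\right]. \tag{11.20} $$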
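The equivalence between the projector form (11.19) and the double cross product (11.20) is a purely algebraic statement about 3-vectors, so it can be checked on arbitrary numbers. The following is a minimal sketch with illustrative helper names; the test vector `j` is an arbitrary complex 3-vector standing in for a Fourier mode of the current.

```python
import math

def dot(a, b):
    """Bilinear dot product (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transverse_projector(n, j):
    """j_perp = j - n (n.j), as in eq. (11.19)."""
    n_dot_j = dot(n, j)
    return tuple(ji - ni * n_dot_j for ni, ji in zip(n, j))

def transverse_double_cross(n, j):
    """j_perp = -n x (n x j), as in eq. (11.20)."""
    return tuple(-c for c in cross(n, cross(n, j)))

theta, phi = 0.9, 2.3
n_hat = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
j = (1.0 + 2.0j, -0.5 + 0.3j, 0.7 - 1.1j)   # arbitrary complex test vector

j_perp_a = transverse_projector(n_hat, j)
j_perp_b = transverse_double_cross(n_hat, j)
longitudinal_part = dot(n_hat, j_perp_a)     # should vanish: n . j_perp = 0
```

The same helpers also confirm that $\hat{\mathbf{n}}\times\tilde{\mathbf{j}} = \hat{\mathbf{n}}\times\tilde{\mathbf{j}}_\perp$, since the component of $\tilde{\mathbf{j}}$ along $\hat{\mathbf{n}}$ drops out of the cross product.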
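Then, the $1/r$ part of the electric field generated by a localized source
can be written as

$$ \mathbf{E}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]}\; i\omega\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,. \tag{11.21} $$

We finally observe that $i\omega\, e^{-i\omega t} = -\partial_t e^{-i\omega t}$. Then, transforming back
to $\mathbf{j}(t,\mathbf{x})$,

$$ \mathbf{E}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\,\partial_t \int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]} \int dt'\, d^3x'\; e^{i\omega t' - i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\; \mathbf{j}_\perp(t',\mathbf{x}')\,. \tag{11.22} $$

We can carry out the integrals over $d\omega$ and $dt'$, writing

$$ \begin{aligned}
\mathbf{E}(t,\mathbf{x}) &= -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\,\partial_t \int dt'\, d^3x'\; \mathbf{j}_\perp(t',\mathbf{x}') \int \frac{d\omega}{2\pi}\; e^{-i\omega(t - r/c - t' + \hat{\mathbf{n}}\cdot\mathbf{x}'/c)} \\
&= -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\,\partial_t \int dt'\, d^3x'\; \mathbf{j}_\perp(t',\mathbf{x}')\; \delta\left[t' - (t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c)\right].
\end{aligned} \tag{11.23} $$

We therefore arrive at our final result,

$$ \mathbf{E}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\,\partial_t \int d^3x'\; \mathbf{j}_\perp(t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c,\ \mathbf{x}')\,, \tag{11.24} $$

or, in terms of $\mu_0$,

$$ \mathbf{E}(t,\mathbf{x}) = -\frac{\mu_0}{4\pi}\,\frac{1}{r}\,\partial_t \int d^3x'\; \mathbf{j}_\perp(t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c,\ \mathbf{x}')\,. \tag{11.25} $$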


We see that the radiation field is generated by the time-varying component
of the current transverse to the line of sight. The only approximation
made to arrive at this result is that we have kept just the term that
at large distances decreases as $1/r$, while we have made no assumption
on the motion of the charges, that can be fully relativistic. This generalizes
the result found in Section 10.5, e.g., eq. (10.131), where we found
that, to lowest order in $v/c$, and for a single point charge, the radiative
component of the electric field is generated by the component of the
acceleration of the charge transverse to the line of sight. Note also that
$t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c$ is just $t_{\rm ret}$, in the large $r$ limit that we have used.

The magnetic field $\mathbf{B} = \nabla\times\mathbf{A}$ can be computed similarly, from
eq. (11.6). Writing $B_i = \epsilon_{ijk}\partial_j A_k$, again the only term proportional to
$1/r$ is obtained when $\partial_j$ acts on the factor $e^{i\omega r/c}$ in eq. (11.6), while the
action on $1/r$ and on the factor $\hat{\mathbf{n}}$ that appears in $e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}$ gives terms
of order $1/r^2$. Using eq. (11.9), we get

$$ \begin{aligned}
\mathbf{B}(t,\mathbf{x}) &= \frac{\mu_0}{4\pi}\,\frac{1}{r}\int d^3x' \int \frac{d\omega}{2\pi}\; \frac{i\omega}{c}\;\hat{\mathbf{n}}\times\tilde{\mathbf{j}}(\omega,\mathbf{x}')\; e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]} \\
&= \frac{1}{4\pi\epsilon_0 c^3}\,\frac{1}{r}\int d^3x' \int \frac{d\omega}{2\pi}\; i\omega\;\hat{\mathbf{n}}\times\tilde{\mathbf{j}}(\omega,\mathbf{x}')\; e^{-i\omega[t-(r/c)+\hat{\mathbf{n}}\cdot\mathbf{x}'/c]}\,.
\end{aligned} \tag{11.26} $$
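Comparing with the first line in eq. (11.11) (and using $\hat{\mathbf{n}}\times\tilde{\mathbf{j}} = \hat{\mathbf{n}}\times\tilde{\mathbf{j}}_\perp$)
we see that, in the far region,

$$ c\,\mathbf{B} = \hat{\mathbf{n}}\times\mathbf{E}\,. \tag{11.27} $$

Once again, observe that this is valid for sources with arbitrary velocities,
i.e., is an exact result for the $1/r$ part of the electric and magnetic fields.

It can also be useful to observe, from eq. (11.21), that the Fourier
transform with respect to time of $\mathbf{E}(t, \mathbf{x})$ is given by

$$ \tilde{\mathbf{E}}(\omega,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{i\omega\, e^{i\omega r/c}}{c^2 r}\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,, \tag{11.28} $$

and, correspondingly,

$$ \tilde{\mathbf{B}}(\omega,\mathbf{x}) = \frac{\mu_0}{4\pi}\,\frac{i\omega\, e^{i\omega r/c}}{c\, r}\;\hat{\mathbf{n}}\times\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,, \tag{11.29} $$

where we recall that $\mathbf{x} = r\hat{\mathbf{n}}$.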


11.1.2 Computation in the Coulomb gauge 


It is instructive to repeat the computation in the Coulomb gauge, where
the potentials obey eqs. (3.93) and (3.94). Let us start from $\phi$. For a
time-independent $\rho(\mathbf{x})$, we found the solution in eq. (4.16), using the
Green's function of the Laplacian. This is immediately generalized to a
source $\rho(t,\mathbf{x})$, since the Laplacian only acts on the spatial coordinates,
so the solution of eq. (3.93) with a time-dependent source $\rho(t, \mathbf{x})$ is

$$ \phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\; \frac{\rho(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{11.30} $$

In the large $r$ limit,

$$ \phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{Q}{r} + O\left(\frac{1}{r^2}\right), \tag{11.31} $$

where $Q$ is the total charge of the system. Therefore, the leading term in
$\nabla\phi$ goes as $1/r^2$ and does not contribute to the radiation field. Indeed,
we already remarked above that the $1/r$ term in the electric field comes
from the retardation effect that is responsible for the term $e^{i\omega r/c}$ in
eqs. (11.5) and (11.6). In the Coulomb gauge, where $\phi$ satisfies a Laplace
equation, $\phi(t,\mathbf{x})$ is determined by $\rho(t, \mathbf{x}')$ at the same time $t$, i.e., is an
instantaneous, rather than a retarded, solution. Correspondingly, there
is no contribution from $\phi$ to the radiation at infinity.⁴

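Let us turn now to eq. (3.94). Using eq. (11.30) together with the
conservation equation (3.22), we get

$$ \frac{\partial\phi}{\partial t}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\,\nabla\cdot\int d^3x'\; \frac{\mathbf{j}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{11.32} $$

Plugging eq. (11.32) into eq. (3.94) we get

$$ \Box\mathbf{A} = -\mu_0\left\{\mathbf{j}(t,\mathbf{x}) + \frac{1}{4\pi}\,\nabla\left[\nabla\cdot\int d^3x'\; \frac{\mathbf{j}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\right]\right\}. \tag{11.33} $$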


3 Furthermore, this contribution is time independent, since, for a system of
charges moving in a bounded region, without charges flowing through the
boundary coming from (or escaping to) infinity, the total charge is conserved.
The first time-dependent term in the expansion of eq. (11.30) for large $r$
comes from the dipole, $4\pi\epsilon_0\,\phi(t,\mathbf{x}) = Q/r + \mathbf{d}(t)\cdot\hat{\mathbf{n}}/r^2 + \ldots$. The first
time-dependent term in $\phi$ is therefore proportional to $1/r^2$, and contributes to a
time-dependent term $O(1/r^3)$ in $\mathbf{E}$.

4 One might be puzzled by the fact that $\phi(t,\mathbf{x})$ is determined by the
instantaneous, rather than retarded, value of the charge density, since this seems to
violate the postulate of Special Relativity that information cannot be
transmitted faster than the speed of light. However, one should not forget that
the gauge potentials are not directly observable quantities. The observable
quantities are the electric and magnetic fields. Since these are gauge
invariant we already know, before performing it explicitly, that the
computation of $\mathbf{E}$ and $\mathbf{B}$ in the Coulomb gauge will give the same result that
we found in the previous section in the Lorenz gauge. In particular, the
electric field at large distances will be given by eq. (11.24), where indeed the electric
field at time $t$ depends on the behavior of the source at time $t_{\rm ret}$ (that, in the
large $r$ limit, we have approximated by $t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c$).

5 Explicitly,
$$ \begin{aligned}
\frac{\partial\phi}{\partial t}(t,\mathbf{x}) &= \frac{1}{4\pi\epsilon_0}\int d^3x'\; \frac{1}{|\mathbf{x}-\mathbf{x}'|}\,\partial_t\rho(t,\mathbf{x}') \\
&= -\frac{1}{4\pi\epsilon_0}\int d^3x'\; \frac{1}{|\mathbf{x}-\mathbf{x}'|}\,\nabla_{\mathbf{x}'}\cdot\mathbf{j}(t,\mathbf{x}') \\
&= \frac{1}{4\pi\epsilon_0}\int d^3x'\; \mathbf{j}(t,\mathbf{x}')\cdot\nabla_{\mathbf{x}'}\frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= -\frac{1}{4\pi\epsilon_0}\,\nabla_{\mathbf{x}}\cdot\int d^3x'\; \frac{\mathbf{j}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,,
\end{aligned} $$
where we integrated $\nabla$ by parts using the fact that we are considering a
source localized in space, so $\mathbf{j}(t,\mathbf{x})$ has compact spatial support.
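6 Observe that the quantity $\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)$ that appeared in the
Lorenz gauge computation, and that was defined in eq. (11.19), is the
transverse part of $\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)$ with respect to the unit vector $\hat{\mathbf{n}}$, i.e., it
satisfies

$$ \hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) = 0\,. \tag{11.40} $$

This is just the Fourier transform, with respect to both time and space, of the
condition

$$ \nabla\cdot\mathbf{j}_\perp(t,\mathbf{x}) = 0\,, \tag{11.41} $$

that defines the quantity $\mathbf{j}_\perp(t,\mathbf{x})$ that appears in eq. (11.35). In fact, from
eq. (11.40) we get

$$ \begin{aligned}
0 &= \hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) \\
&= \int d^3x'\, dt'\; e^{i\omega t' - i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\;\hat{\mathbf{n}}\cdot\mathbf{j}_\perp(t',\mathbf{x}') \\
&= \frac{ic}{\omega}\int d^3x'\, dt'\; e^{i\omega t'}\left(\nabla_{\mathbf{x}'}\, e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\right)\cdot\mathbf{j}_\perp(t',\mathbf{x}') \\
&= -\frac{ic}{\omega}\int d^3x'\, dt'\; e^{i\omega t' - i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\;\nabla_{\mathbf{x}'}\cdot\mathbf{j}_\perp(t',\mathbf{x}')\,,
\end{aligned} \tag{11.42} $$

where in the last line we have integrated by parts, assuming that $\mathbf{j}_\perp(t', \mathbf{x}')$ is
localized. Therefore, $\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) = 0$
is equivalent to $\nabla_{\mathbf{x}}\cdot\mathbf{j}_\perp(t,\mathbf{x}) = 0$.

There is actually a subtlety for $\omega = 0$, since, from eq. (11.42), the vanishing of
$\nabla\cdot\mathbf{j}_\perp(t,\mathbf{x})$ only implies the vanishing of $\omega\,\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)$, and this is
automatically satisfied if $\omega = 0$, without the need of imposing
$\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|_{\omega=0} = 0$. However, a Fourier mode with
$\mathbf{k} = \omega\hat{\mathbf{n}}/c = 0$ corresponds to a spatially constant term, which is eliminated by
the boundary condition that $\mathbf{j}(t,\mathbf{x})$ is localized in space.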


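We now use the decomposition of a vector field into its longitudinal and
transverse parts. As discussed in detail in Solved Problem 11.1, a general
vector field $\mathbf{V}(\mathbf{x})$ can be decomposed as

$$ \mathbf{V}(\mathbf{x}) = \mathbf{V}_\perp(\mathbf{x}) + \mathbf{V}_\parallel(\mathbf{x})\,, \tag{11.34} $$

where

$$ \mathbf{V}_\perp(\mathbf{x}) = \frac{1}{4\pi}\,\nabla\times\left[\nabla\times\int d^3x'\; \frac{\mathbf{V}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\right], \tag{11.35} $$

$$ \mathbf{V}_\parallel(\mathbf{x}) = -\frac{1}{4\pi}\,\nabla\left[\nabla\cdot\int d^3x'\; \frac{\mathbf{V}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\right]. \tag{11.36} $$

The decomposition is such that $\nabla\cdot\mathbf{V}_\perp = 0$ and $\nabla\times\mathbf{V}_\parallel = 0$, and
defines the transverse and longitudinal components of the vector field.
In Fourier space, eqs. (11.35) and (11.36) become

$$ \tilde{\mathbf{V}}_\perp(\mathbf{k}) = \tilde{\mathbf{V}}(\mathbf{k}) - \hat{\mathbf{k}}\left[\hat{\mathbf{k}}\cdot\tilde{\mathbf{V}}(\mathbf{k})\right], \qquad \tilde{\mathbf{V}}_\parallel(\mathbf{k}) = \hat{\mathbf{k}}\left[\hat{\mathbf{k}}\cdot\tilde{\mathbf{V}}(\mathbf{k})\right], \tag{11.37} $$

and these quantities satisfy $\mathbf{k}\cdot\tilde{\mathbf{V}}_\perp(\mathbf{k}) = 0$, $\mathbf{k}\times\tilde{\mathbf{V}}_\parallel(\mathbf{k}) = 0$ (see Solved
Problem 11.1 for details). So, eq. (11.33) can be written as

$$ \Box\mathbf{A} = -\mu_0\left[\mathbf{j}(t,\mathbf{x}) - \mathbf{j}_\parallel(t,\mathbf{x})\right], \tag{11.38} $$

and, since $\mathbf{j}(t,\mathbf{x}) = \mathbf{j}_\perp(t,\mathbf{x}) + \mathbf{j}_\parallel(t,\mathbf{x})$,

$$ \Box\mathbf{A} = -\mu_0\,\mathbf{j}_\perp(t,\mathbf{x})\,. \tag{11.39} $$

Notice that this equation is consistent with the fact that we have derived
it in the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, precisely because, on the right-hand
side, the source satisfies $\nabla\cdot\mathbf{j}_\perp = 0$. Therefore, the quantity denoted by
$\mathbf{j}_\perp(t,\mathbf{x})$ in the previous section is precisely the divergence-less part of
the vector field $\mathbf{j}(t, \mathbf{x})$, defined as in eq. (11.35).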

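Equation (11.39) can be solved using the Green's function method.
As in eq. (11.1), we use the retarded Green's function (10.24) of the
d'Alembertian, so

$$ \mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\int d^3x'\; \frac{\mathbf{j}_\perp\left(t - |\mathbf{x}-\mathbf{x}'|/c,\ \mathbf{x}'\right)}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{11.43} $$

Performing the Fourier transform of $\mathbf{j}_\perp$ with respect to time, we can
rewrite this as

$$ \mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\int d^3x' \int \frac{d\omega}{2\pi}\; \frac{e^{-i\omega(t - |\mathbf{x}-\mathbf{x}'|/c)}\;\tilde{\mathbf{j}}_\perp(\omega,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{11.44} $$

We next take the large $r$ limit, keeping only the terms that will
contribute to the $1/r$ part, so

$$ \begin{aligned}
\mathbf{A}(t,\mathbf{x}) &= \frac{\mu_0}{4\pi}\,\frac{1}{r}\int d^3x' \int \frac{d\omega}{2\pi}\; e^{-i\omega(t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c)}\;\tilde{\mathbf{j}}_\perp(\omega,\mathbf{x}') \\
&= \frac{\mu_0}{4\pi}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega(t - r/c)}\int d^3x'\; e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\;\tilde{\mathbf{j}}_\perp(\omega,\mathbf{x}') \\
&= \frac{\mu_0}{4\pi}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega(t - r/c)}\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,.
\end{aligned} \tag{11.45} $$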
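Finally, we have seen that, in the Coulomb gauge, the term $-\nabla\phi$ does
not contribute to order $1/r$, so keeping only the term $O(1/r)$ we get

$$ \mathbf{E}(t,\mathbf{x}) = -\frac{\partial\mathbf{A}}{\partial t} = \frac{\mu_0}{4\pi}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]}\; i\omega\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) \tag{11.46} $$

$$ \phantom{\mathbf{E}(t,\mathbf{x})} = \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int \frac{d\omega}{2\pi}\; e^{-i\omega[t-(r/c)]}\; i\omega\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,. \tag{11.47} $$

This agrees with eq. (11.21) [and therefore also with eq. (11.24)], as it
should for a gauge invariant quantity computed in two different gauges.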


11.1.3 Radiated power and spectral distribution 


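As we saw in eq. (3.35), the energy radiated per unit time through an
infinitesimal surface $d\mathbf{s}$ is given by $d\mathbf{s}\cdot\mathbf{S}$, where $\mathbf{S}$ is the Poynting
vector, given by eq. (3.34). We consider a sphere at large distance from
the region where the source is localized, with its origin inside the
localization region, so $d\mathbf{s} = r^2 d\Omega\,\hat{\mathbf{n}}$, where $\hat{\mathbf{n}}$ is the unit normal in the
radial direction, and we take the large $r$ limit. For the electric field we
then use eq. (11.25), which is exact as far as the term $1/r$ is concerned
while, for the magnetic field, we have seen the $1/r$ contribution is given
by eq. (11.27). Then, to order $1/r^2$, the exact result for the Poynting
vector is

$$ \mathbf{S} = \frac{1}{\mu_0 c}\,E^2\,\hat{\mathbf{n}} = \frac{\mu_0}{16\pi^2 c\, r^2}\left|\,\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\right|^2 \hat{\mathbf{n}}\,, \tag{11.48} $$

or, in terms of $\epsilon_0$,

$$ \mathbf{S} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3 r^2}\left|\,\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\right|^2 \hat{\mathbf{n}}\,. \tag{11.49} $$

The power $d\mathcal{E}/dt$ radiated in the solid angle $d\Omega$ is obtained performing
the scalar product with $d\mathbf{s} = r^2 d\Omega\,\hat{\mathbf{n}}$,⁷ so

$$ \frac{d\mathcal{E}}{dt\,d\Omega} = \frac{\mu_0}{16\pi^2 c}\left|\,\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\right|^2, \tag{11.50} $$

or, in terms of $\epsilon_0$,

$$ \frac{d\mathcal{E}}{dt\,d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\left|\,\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\right|^2. \tag{11.51} $$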


7 Observe that here we are considering the power per unit time $t$, rather than
per unit retarded time $t_{\rm ret}$, i.e., the "received" power $P$, see eq. (10.151) and
the discussion below it. Observe also that we denote the energy by $\mathcal{E}$,
reserving the symbol $E$ for the modulus of the electric field.
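8 Explicitly, writing $f(t) = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\tilde f(\omega)$,
$$ \int_{-\infty}^{+\infty} dt\; |f(t)|^2 = \int \frac{d\omega}{2\pi}\,\frac{d\omega'}{2\pi}\;\tilde f(\omega)\,\tilde f^*(\omega')\int_{-\infty}^{+\infty} dt\; e^{-i(\omega-\omega')t} = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\; |\tilde f(\omega)|^2\,, $$
where we used $\int dt\, e^{-i(\omega-\omega')t} = 2\pi\,\delta(\omega-\omega')$.

9 In the physics literature, it is also often called the Parseval theorem. More
accurately, the Parseval theorem is a discrete version of this result, based on
Fourier series rather than on Fourier integrals.

10 Explicitly,
$$ \begin{aligned}
\int_{-\infty}^{+\infty} dt\; e^{i\omega t}\;\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')
&= -\int_{-\infty}^{+\infty} dt\;\left(\partial_t e^{i\omega t}\right)\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}') \\
&= -i\omega\int_{-\infty}^{+\infty} dt\; e^{i\omega t}\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}') \\
&= -i\omega\int d^3x'\int_{-\infty}^{+\infty} dt\; e^{i\omega(t_{\rm ret} + r/c - \hat{\mathbf{n}}\cdot\mathbf{x}'/c)}\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}') \\
&= -i\omega\, e^{i\omega r/c}\int d^3x'\; e^{-i\omega\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\int_{-\infty}^{+\infty} dt_{\rm ret}\; e^{i\omega t_{\rm ret}}\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}') \\
&= -i\omega\, e^{i\omega r/c}\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,,
\end{aligned} $$
where we have used the expression of $t_{\rm ret}$ valid at large $r$, $t_{\rm ret} = t - r/c + \hat{\mathbf{n}}\cdot\mathbf{x}'/c$,
to express $t$ in terms of $t_{\rm ret}$, and then the fact that, at fixed $\mathbf{x}'$,
$\int_{-\infty}^{+\infty} dt = \int_{-\infty}^{+\infty} dt_{\rm ret}$.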


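Consider now a source that acts only for a finite amount of time. In this
case, the total radiated energy per unit solid angle is finite, and is given by

$$ \frac{d\mathcal{E}}{d\Omega} = \int_{-\infty}^{+\infty} dt\;\frac{d\mathcal{E}}{dt\,d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\int_{-\infty}^{+\infty} dt\;\left|\,\partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\right|^2. \tag{11.52} $$

We now use the fact that, given a square-integrable function $f(t)$, we
have the identity⁸

$$ \int_{-\infty}^{+\infty} dt\; |f(t)|^2 = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\; |\tilde f(\omega)|^2\,, \tag{11.53} $$

which is known as the Plancherel theorem.⁹ If, furthermore, $f(t)$ is real,
then $\tilde f(\omega) = \tilde f^*(-\omega)$, so $|\tilde f(\omega)|^2 = |\tilde f(-\omega)|^2$, and eq. (11.53) can be
written as

$$ \int_{-\infty}^{+\infty} dt\; |f(t)|^2 = 2\int_{0}^{\infty}\frac{d\omega}{2\pi}\; |\tilde f(\omega)|^2\,. \tag{11.54} $$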
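The one-sided form (11.54) is easy to test numerically on a simple real function. The sketch below uses the Fourier convention of eq. (11.14), $\tilde f(\omega)=\int dt\, e^{i\omega t} f(t)$; the test function, the grid sizes and the cut-offs are arbitrary choices made for illustration, and the integrals are done with a plain midpoint rule.

```python
import cmath
import math

def f(t):
    """Real test signal: a Gaussian-modulated oscillation."""
    return math.exp(-0.5 * t * t) * math.cos(2.0 * t)

# Time grid (midpoint rule); the signal is negligible for |t| > 10.
T_MAX, N_T = 10.0, 400
DT = 2.0 * T_MAX / N_T
times = [-T_MAX + (k + 0.5) * DT for k in range(N_T)]

def f_tilde(omega):
    """Numerical Fourier transform, f~(w) = int dt e^{i w t} f(t)."""
    return sum(f(t) * cmath.exp(1j * omega * t) for t in times) * DT

# Left-hand side of eq. (11.54): energy computed in the time domain.
time_side = sum(f(t) ** 2 for t in times) * DT

# Right-hand side: twice the integral over positive frequencies only.
W_MAX, N_W = 8.0, 320
DW = W_MAX / N_W
omegas = [(k + 0.5) * DW for k in range(N_W)]
freq_side = 2.0 * sum(abs(f_tilde(w)) ** 2 for w in omegas) * DW / (2.0 * math.pi)

# Closed form for this particular signal: (sqrt(pi)/2) (1 + e^{-4}).
exact = 0.5 * math.sqrt(math.pi) * (1.0 + math.exp(-4.0))
```

The factor of 2 in `freq_side` is exactly the one that distinguishes the one-sided spectral density used in the text from a two-sided one.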
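We can apply this to the real (vector) function

$$ \mathbf{f}(t) = \partial_t\int d^3x'\;\mathbf{j}_\perp(t_{\rm ret},\mathbf{x}')\,, \tag{11.55} $$

that appears in eq. (11.51). (Note that this function is real; the modulus
that appears in eq. (11.51) refers, of course, to the modulus in the sense
of vectors, $|\mathbf{f}|^2 = f_i f_i$.) Its Fourier transform is given by¹⁰

$$ \tilde{\mathbf{f}}(\omega) = -i\omega\, e^{i\omega r/c}\;\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)\,. \tag{11.56} $$

Then, from eqs. (11.52), (11.54), and (11.56),

$$ \frac{d\mathcal{E}}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2\pi c^3}\int_0^\infty\frac{d\omega}{2\pi}\;\omega^2\,|\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|^2\,. \tag{11.57} $$

Note that the Fourier modes $\tilde{\mathbf{j}}_\perp$ are complex, and $|\tilde{\mathbf{j}}_\perp|^2$ is now a notation
for $(\tilde{\mathbf{j}}_\perp)_i(\tilde{\mathbf{j}}_\perp)_i^*$, i.e., we take the modulus square of the vector and the
modulus of the complex number. We can rewrite eq. (11.57) in the form

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{1}{4\pi\epsilon_0}\,\frac{\omega^2}{4\pi^2 c^3}\,|\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|^2\,, \tag{11.58} $$

or, in terms of $\mu_0$,

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{\mu_0\,\omega^2}{16\pi^3 c}\,|\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|^2\,. \tag{11.59} $$

This gives the total energy radiated by the source, per unit solid angle
and unit frequency, i.e., the energy spectrum per unit solid angle. The
total energy spectrum is obtained integrating over the solid angle,

$$ \frac{d\mathcal{E}}{d\omega} = \frac{\mu_0\,\omega^2}{16\pi^3 c}\int d\Omega\;|\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|^2\,. \tag{11.60} $$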
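Observe that the angular dependence, over which we integrate, enters
through the unit vector in the radial direction, $\hat{\mathbf{n}}$. In polar coordinates,
$d\Omega = d\cos\theta\, d\phi$, and

$$ \hat{\mathbf{n}} = (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)\,. \tag{11.61} $$

Writing the result in terms of $\tilde{\mathbf{j}}_\perp$ gives a nicely compact expression.
Furthermore, we see from eq. (11.28) that the direction of $\tilde{\mathbf{E}}(\omega,\mathbf{x})$ is the
same as that of $\tilde{\mathbf{j}}_\perp$, so this expression also allows us to easily read the
polarization of the radiation. Two alternative expressions, however, can
also be useful. First, using eq. (11.20) and observing that, for any vector
$\mathbf{v}$,

$$ |\hat{\mathbf{n}}\times(\hat{\mathbf{n}}\times\mathbf{v})| = |\hat{\mathbf{n}}\times\mathbf{v}|\,, \tag{11.62} $$

we can rewrite eq. (11.58) as¹¹

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{1}{4\pi\epsilon_0}\,\frac{\omega^2}{4\pi^2 c^3}\,\left|\hat{\mathbf{n}}\times\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)\right|^2. \tag{11.65} $$

Alternatively, we can rewrite the result in terms of $\tilde{\mathbf{j}}$ and $\tilde\rho$, using
eq. (11.19) and observing that the continuity equation in momentum
space, eq. (11.16), gives

$$ \hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c) = c\,\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c)\,, \tag{11.66} $$

and therefore

$$ \tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c) = \tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c) - \hat{\mathbf{n}}\,c\,\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c)\,. \tag{11.67} $$

[We have indeed simply undone the passages that led from eq. (11.11)
to eq. (11.18).] Then, suppressing temporarily for notational simplicity
the argument $(\omega,\omega\hat{\mathbf{n}}/c)$,

$$ |\tilde{\mathbf{j}}_\perp|^2 = (\tilde{\mathbf{j}} - \hat{\mathbf{n}}\,c\,\tilde\rho)\cdot(\tilde{\mathbf{j}} - \hat{\mathbf{n}}\,c\,\tilde\rho)^* = |\tilde{\mathbf{j}}|^2 + c^2|\tilde\rho|^2 - 2\,{\rm Re}\left[c\,\tilde\rho^*\,\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}\right]. \tag{11.68} $$

From eq. (11.67), together with $\hat{\mathbf{n}}\cdot\tilde{\mathbf{j}}_\perp = 0$, it follows that

$$ \hat{\mathbf{n}}\cdot\tilde{\mathbf{j}} = c\,\tilde\rho\,, \tag{11.69} $$

and therefore eq. (11.68) becomes

$$ |\tilde{\mathbf{j}}_\perp|^2 = |\tilde{\mathbf{j}}|^2 - c^2|\tilde\rho|^2\,. \tag{11.70} $$

Therefore, we can also rewrite eq. (11.59) as

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{\mu_0\,\omega^2}{16\pi^3 c}\left(|\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)|^2 - c^2|\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c)|^2\right), \tag{11.71} $$

or, in terms of $\epsilon_0$,

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{1}{4\pi\epsilon_0}\,\frac{\omega^2}{4\pi^2 c^3}\left(|\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)|^2 - c^2|\tilde\rho(\omega,\omega\hat{\mathbf{n}}/c)|^2\right). \tag{11.72} $$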

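11 Note that we have defined the spectral density $d\mathcal{E}/d\omega\, d\Omega$ as the quantity
that gives the total radiated energy per unit solid angle, when integrated over
$d\omega$ from $\omega = 0$ to $\omega = \infty$ (rather than from $-\infty$ to $+\infty$), see eq. (11.57). This
is also called a "one-sided spectral density." If one rather uses a "two-sided
spectral density," which is the quantity that gives the total radiated
energy per unit solid angle when integrated over $d\omega$ from $\omega = -\infty$ to $\omega = \infty$, then
eq. (11.58) is replaced by

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{1}{4\pi\epsilon_0}\,\frac{\omega^2}{8\pi^2 c^3}\,|\tilde{\mathbf{j}}_\perp(\omega,\omega\hat{\mathbf{n}}/c)|^2\,, \tag{11.63} $$

and eq. (11.65) is replaced by

$$ \frac{d\mathcal{E}}{d\Omega\,d\omega} = \frac{1}{4\pi\epsilon_0}\,\frac{\omega^2}{8\pi^2 c^3}\,\left|\hat{\mathbf{n}}\times\tilde{\mathbf{j}}(\omega,\omega\hat{\mathbf{n}}/c)\right|^2. \tag{11.64} $$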
12 Note that here we are only concerned with the "internal motions" of
the source, i.e., a superposition of periodic motions, with non-zero
frequencies $\omega'$, localized within a bounded region of space (for which, therefore, the
Fourier transform is well defined). Any bulk motion of the center-of-mass of the
source can then be included performing first the computation of the
electromagnetic field in a reference frame where the center-of-mass is at rest, and
then performing a boost, similarly to what we did in Section 10.3 for a point
charge.


11.2 Low-velocity limit and multipole expansion of the radiation field

We now go back to the expression for the large-$r$ limit of the gauge
potentials. We work in the Lorenz gauge, so we use eqs. (11.5) and
(11.6), that we rewrite here, inverting the order of the integrals, as

$$ \phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\int\frac{d\omega'}{2\pi}\; e^{-i\omega'(t-r/c)}\int d^3x'\;\tilde\rho(\omega',\mathbf{x}')\; e^{-i\omega'\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\,, \tag{11.73} $$

$$ \mathbf{A}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\int\frac{d\omega'}{2\pi}\; e^{-i\omega'(t-r/c)}\int d^3x'\;\tilde{\mathbf{j}}(\omega',\mathbf{x}')\; e^{-i\omega'\hat{\mathbf{n}}\cdot\mathbf{x}'/c}\,. \tag{11.74} $$

We have put a prime also over $\omega$, to stress that it is an integration
variable. To obtain these expressions we had assumed that the source,
given in general by a distribution of moving charges, is localized in a
region of space $|\mathbf{x}'| < d$, and we kept only the leading term in the limit
$r \gg d$, which is $O(1/r)$. On the other hand, the computation of this $1/r$
term was exact. In particular, we made no assumption on the velocity
of the particles that create this distribution of charge and current. We
now consider the case in which the velocities of the charges are
non-relativistic, $v \ll c$.

Consider first the case in which the source performs a simple harmonic
motion, with angular frequency $\omega_s$, confined to a region of size $d$;
for instance, a point charge in a circular orbit of radius $d$ and angular
frequency $\omega_s$. The typical velocity $v$ of such a charge is of order $\omega_s d$ and
the condition $v/c \ll 1$ becomes

$$ \frac{\omega_s d}{c} \ll 1\,. \tag{11.75} $$

More generally, a source with more complex internal motions will be
characterized by a superposition of Fourier modes, and will generate a
distribution of charge density and current density which is not monochromatic,
but rather described by $\tilde\rho(\omega', \mathbf{x}')$ and $\tilde{\mathbf{j}}(\omega', \mathbf{x}')$. The non-relativistic
limit is applicable when the Fourier modes $\tilde\rho(\omega',\mathbf{x}')$ and $\tilde{\mathbf{j}}(\omega',\mathbf{x}')$ are
sizable only for values of $\omega'$ such that $\omega' d/c \ll 1$, and go quickly to
zero for larger values of $\omega'$. This means that only values of $\omega'$ such that
$\omega' d/c \ll 1$ contribute significantly to the integrals over $d\omega'$ in eqs. (11.73)
and (11.74).¹² At the same time, the integration over $d^3x'$ in eqs. (11.73)
and (11.74) is restricted to the region $|\mathbf{x}'| < d$, since the source is localized.
Therefore, the term $\omega'\hat{\mathbf{n}}\cdot\mathbf{x}'/c$, that appears in the exponentials in
eqs. (11.73) and (11.74), is always much smaller than one (in absolute
value) whenever the rest of the integrand is sizable, and we can expand
the exponentials in powers of it. As we will see, this gives rise to an
expansion in time-dependent (or "radiative") multipoles.

Before entering into the technicalities of this expansion, let us better
understand its physical meaning. As we will confirm below with the
explicit computation, the frequency $\omega$ of the radiation emitted by a
source whose motion has a typical frequency $\omega_s$ will also be of order of



$\omega_s$ (with a numerical coefficient of order one that depends on the order
of the multipole expansion, see below; we already saw in Problems 10.1
and 10.2 that, for dipole radiation, $\omega = \omega_s$) and therefore $\omega \sim \omega_s \sim v/d$.
In terms of the reduced wavelength ƛ $= c/\omega$ defined in eq. (9.56), this
means

$$ ƛ \sim \frac{c}{v}\, d\,. \tag{11.76} $$

In a non-relativistic system we have $v \ll c$, and therefore the reduced
wavelength of the radiation emitted is much larger than the size of the
system that generates it:

$$ ƛ \gg d \qquad \text{(non-relativistic sources)}\,. \tag{11.77} $$

When the reduced wavelength is much larger than the size of the system,
we expect that we do not need to know the internal motions of the
source in all its details, and only the coarse features of the distribution
of the charge and current densities, as encoded in their lowest multipole
moments, should be relevant. This physical intuition will be confirmed
by the computation that we perform below, where we will verify that
the expansion in powers of $\omega'\hat{\mathbf{n}}\cdot\mathbf{x}'/c$ gives rise to an expansion in
time-dependent multipole moments. We will then compare with the expansion
in static multipole moments that we performed in Chapter 6.
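To get a feeling for the size of the expansion parameter, the estimate (11.76) can be evaluated for a concrete system. The sketch below is only an order-of-magnitude illustration: the function names are ours, and the example values (an atom-sized source with $d \simeq 0.53\times10^{-10}$ m and internal velocity $v \simeq c/137$) are assumptions chosen to match the classical hydrogen atom of Problem 10.3.

```python
C = 299_792_458.0  # speed of light in m/s

def reduced_wavelength(v, d):
    """Order-of-magnitude estimate of eq. (11.76): lambda-bar ~ (c / v) d."""
    return (C / v) * d

def expansion_parameter(v):
    """The ratio d / lambda-bar ~ omega d / c ~ v / c that controls the
    multipole expansion for a source with typical internal velocity v."""
    return v / C

# Assumed example: atom-sized source, v ~ c/137.
d_atom = 0.53e-10
v_atom = C / 137.0

lambda_bar_atom = reduced_wavelength(v_atom, d_atom)
ratio_atom = lambda_bar_atom / d_atom
epsilon_atom = expansion_parameter(v_atom)
```

For these values the reduced wavelength exceeds the size of the source by about two orders of magnitude, so successive terms of the expansion in $\omega'\hat{\mathbf{n}}\cdot\mathbf{x}'/c$ are each suppressed by a factor of order $10^{-2}$.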

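We now perform explicitly the expansion of the scalar and vector
potentials.$^{13}$ From eq. (11.73),

$$\phi(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\int\frac{d\omega}{2\pi}\, e^{-i\omega(t-r/c)}\int d^3x'\,\tilde\rho(\omega,\mathbf x')
\left\{1 - \frac{i\omega}{c}\,\hat n_i x'_i + \frac{1}{2}\left(\frac{-i\omega}{c}\right)^{2}\hat n_i\hat n_j x'_i x'_j + \ldots\right\}. \tag{11.78}$$

The first term gives simply the total charge $q$ of the source,

$$\begin{aligned}
\int\frac{d\omega}{2\pi}\, e^{-i\omega(t-r/c)}\int d^3x'\,\tilde\rho(\omega,\mathbf x')
&= \int d^3x'\left[\int\frac{d\omega}{2\pi}\,\tilde\rho(\omega,\mathbf x')\, e^{-i\omega(t-r/c)}\right] \\
&= \int d^3x'\,\rho(t-r/c,\mathbf x') \\
&= q(t-r/c).
\end{aligned} \tag{11.79}$$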


However, we assumed that the motion of the source is localized inside 
a finite volume, so no charges are escaping to infinity (or coming from 
infinity). Therefore the charge is actually time independent, 

$$q(t-r/c) = q. \tag{11.80}$$


$^{13}$ Note that, if we are interested only in the electric and magnetic fields, we
could directly perform the expansion in powers of $\omega\,\hat{\mathbf n}\cdot\mathbf x'/c$ on the electric field,
starting from eq. (11.22), and recover the magnetic field in the far region from
eq. (11.27). However, the structure of the expansion is somewhat clearer,
conceptually, starting from the gauge potentials. Furthermore, especially at the
quantum level, one can be interested in the multipole expansion of the gauge
potentials themselves.

$^{14}$ One should not confuse the components of the dipole moment, that we
denote by $d_i$, with the size of the system, that we denote by $d$.

The second term in the expansion is related to the time derivative of the
electric dipole of the charge distribution. Indeed,


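$$\begin{aligned}
\int\frac{d\omega}{2\pi}\, e^{-i\omega(t-r/c)}\int d^3x'\,\tilde\rho(\omega,\mathbf x')\left(\frac{-i\omega}{c}\right)\hat n_i x'_i
&= \frac{\hat n_i}{c}\,\partial_t\int\frac{d\omega}{2\pi}\, e^{-i\omega(t-r/c)}\int d^3x'\,\tilde\rho(\omega,\mathbf x')\, x'_i \\
&= \frac{\hat n_i}{c}\,\partial_t\int d^3x'\,\rho(t-r/c,\mathbf x')\, x'_i \\
&= \frac{\hat n_i}{c}\,\dot d_i(t-r/c),
\end{aligned} \tag{11.81}$$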


where the time-dependent electric dipole moment of a charge distribution
is defined by

$$\mathbf d(t) = \int d^3x\,\rho(t,\mathbf x)\,\mathbf x, \tag{11.82}$$

which generalizes the definition (6.8) that we introduced for the static
case.$^{14}$ The third term of the expansion can be transformed as follows:


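$$\int\frac{d\omega}{2\pi}\, e^{-i\omega(t-r/c)}\int d^3x'\,\tilde\rho(\omega,\mathbf x')\,\frac{1}{2}\left(\frac{-i\omega}{c}\right)^{2}\hat n_i\hat n_j x'_i x'_j
= \frac{\hat n_i\hat n_j}{2c^2}\,\partial_t^2\int d^3x'\,\rho(t-r/c,\mathbf x')\, x'_i x'_j. \tag{11.83}$$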


The result therefore depends on the second moment of the charge
distribution,

$$D_{ij}(t) = \int d^3x'\,\rho(t,\mathbf x')\, x'_i x'_j. \tag{11.84}$$


Actually, just as in the static case, it is also convenient to introduce the
quadrupole moment of the charge distribution, which is defined by

$$Q_{ij}(t) = \int d^3x'\,\rho(t,\mathbf x')\left(3x'_i x'_j - \delta_{ij}|\mathbf x'|^2\right). \tag{11.85}$$


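Equation (11.85) is the generalization of eq. (6.18) to a time-dependent
charge density. Comparing eqs. (11.84) and (11.85),

$$D_{ij}(t) = \frac{1}{3}\,Q_{ij}(t) + \frac{1}{3}\,\delta_{ij}\int d^3x'\,|\mathbf x'|^2\rho(t,\mathbf x'). \tag{11.86}$$

For the moment, we will however write the result in terms of $D_{ij}$, which
gives a more compact expression. Putting together the various terms,
we get

$$\phi(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\left[\,q + \frac{\hat n_i}{c}\,\dot d_i(t-r/c) + \frac{\hat n_i\hat n_j}{2c^2}\,\ddot D_{ij}(t-r/c) + \ldots\right]. \tag{11.87}$$

The analogous expansion for $\mathbf A$ is obtained from eq. (11.74),

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\int d^3x'\, j_i(t-r/c,\mathbf x')
+ \frac{\hat n_j}{c}\,\partial_t\int d^3x'\, j_i(t-r/c,\mathbf x')\, x'_j + \ldots\right]. \tag{11.88}$$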
The electric field in the radiation zone is obtained inserting these
expressions for the gauge potentials in eq. (3.83), and the magnetic field
in the radiation zone is then obtained from eq. (11.27). We are only
interested in the terms $O(1/r)$ in $\mathbf E$, and in the corresponding terms in
$\mathbf B$, since these are the only ones that give a contribution $O(1/r^2)$ to the
Poynting vector and therefore to the radiation flux at infinity. Let us
begin by considering the contribution of $\nabla\phi$ to $\mathbf E$. When the gradient
acts on the overall $1/r$ factor in eq. (11.87) it produces a term $O(1/r^2)$,
which does not contribute to the radiation field. Then, in particular, the
Coulomb term proportional to $q/r$ in eq. (11.87) does not contribute to
the radiation field, and

$$\nabla\phi = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\,\nabla\left[\frac{\hat n_i}{c}\,\dot d_i(t-r/c) + \frac{\hat n_i\hat n_j}{2c^2}\,\ddot D_{ij}(t-r/c) + \ldots\right] + O\!\left(\frac{1}{r^2}\right). \tag{11.89}$$
Similarly, from eq. (11.8) we see that when the gradient acts on the
factors $\hat n_i$, $\hat n_i\hat n_j$, and so on, that appear inside the brackets, it generates
an extra $1/r$ term, so that, overall, the corresponding contribution to $\mathbf E$
is again $O(1/r^2)$. Therefore, we only need to apply the gradient to the
multipole moments,

$$\nabla\phi = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\left[\frac{\hat n_i}{c}\,\nabla\dot d_i(t-r/c) + \frac{\hat n_i\hat n_j}{2c^2}\,\nabla\ddot D_{ij}(t-r/c) + \ldots\right] + O\!\left(\frac{1}{r^2}\right). \tag{11.90}$$


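The multipole moments depend on the spatial variables only through $r$
and, for a generic function $f(t-r/c)$, we have

$$\nabla f(t-r/c) = \hat{\mathbf n}\,\partial_r f(t-r/c) = -\frac{\hat{\mathbf n}}{c}\,\partial_t f(t-r/c), \tag{11.91}$$

so in this case the gradient does not produce extra factors of $1/r$. Then

$$\nabla\phi = -\frac{1}{4\pi\epsilon_0}\,\frac{\hat{\mathbf n}}{c\,r}\left[\frac{\hat n_i}{c}\,\ddot d_i(t-r/c) + \frac{\hat n_i\hat n_j}{2c^2}\,\dddot D_{ij}(t-r/c) + \ldots\right] + O\!\left(\frac{1}{r^2}\right). \tag{11.92}$$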
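The identity (11.91) can be checked numerically. The sketch below compares a central finite-difference gradient of a sample profile $f(u)=\sin(\omega u)$, evaluated at $u=t-r/c$, with the right-hand side of eq. (11.91). The choice of profile, the units with $c=1$, and all function names are assumptions made only for this illustration.

```python
import math

C = 1.0  # units with c = 1, an assumption made for simplicity


def retarded_profile(t, x, omega):
    """Sample function f(t - r/c) with f(u) = sin(omega * u)."""
    r = math.sqrt(sum(xi * xi for xi in x))
    return math.sin(omega * (t - r / C))


def numerical_gradient(t, x, omega, h=1e-5):
    """Central finite-difference gradient of the retarded profile."""
    grad = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        diff = retarded_profile(t, xp, omega) - retarded_profile(t, xm, omega)
        grad.append(diff / (2.0 * h))
    return grad


def gradient_from_eq_11_91(t, x, omega):
    """Right-hand side of eq. (11.91): -(n_hat / c) * d/dt f(t - r/c)."""
    r = math.sqrt(sum(xi * xi for xi in x))
    dfdt = omega * math.cos(omega * (t - r / C))
    return [-(xi / r) / C * dfdt for xi in x]
```

For instance, `numerical_gradient(0.5, (1.0, 2.0, 2.0), 2.0)` and `gradient_from_eq_11_91(0.5, (1.0, 2.0, 2.0), 2.0)` should agree up to the finite-difference error.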


There is, therefore, an infinite series of terms contributing to the electric
field at order $1/r$, generated by higher and higher multipoles. These
contributions correspond to an expansion in powers of $v/c$. In fact, for
a source with a generic structure, for dimensional reasons each higher-order
multipole carries one more power of the only length-scale in the
problem, which is the typical size $d$ of the system. For instance, the
monopole is just the total electric charge $q$; apart from dimensionless
numbers, the electric dipole $d_i$ is of order $qd$, the quadrupole $Q_{ij}$ is of
order $qd^2$, etc. Furthermore, each time derivative produces one power
of the typical frequency of the system, so, for instance,
$\dot d_i(t-r/c) \sim \omega_s q d \sim qv$, where $v \sim \omega_s d$ is the typical velocity of the internal motions
of the source. Similarly, $\ddot d_i(t-r/c) \sim \omega_s^2 q d \sim q v^2/d$, and
$\dddot D_{ij}(t-r/c) \sim \omega_s^3 q d^2 \sim q v^3/d$. Then, we see that, in eq. (11.92),

$$\frac{1}{c^2}\,\hat n_i\,\ddot d_i \sim \frac{q}{d}\,\frac{v^2}{c^2}, \tag{11.93}$$

$$\frac{1}{c^3}\,\hat n_i\hat n_j\,\dddot D_{ij} \sim \frac{q}{d}\,\frac{v^3}{c^3}, \tag{11.94}$$

and so on: each further term in the bracket has one more time derivative,
giving an extra $\omega_s$ factor; an extra factor $d$ coming simply because each
subsequent multipole moment has an extra power of $x'$ in its definition;
and an extra factor $1/c$ from the expansion of $e^{-i\omega\hat{\mathbf n}\cdot\mathbf x'/c}$, leading, overall,
to an extra factor of order $\omega_s d/c \sim v/c$ with respect to the previous term.

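Similar estimates can be applied to the contribution of $\mathbf A$ to the
electric field. From eq. (11.88),

$$\frac{\partial A_i}{\partial t} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\partial_t\int d^3x'\, j_i(t-r/c,\mathbf x')
+ \frac{\hat n_j}{c}\,\partial_t^2\int d^3x'\, j_i(t-r/c,\mathbf x')\, x'_j + \ldots\right]. \tag{11.95}$$

We now observe that $\mathbf j$ is of order $qv$, so, in order of magnitude,

$$\frac{1}{c^2}\,\partial_t\int d^3x'\, j_i(t-r/c,\mathbf x') \sim \frac{\omega_s q v}{c^2} \sim \frac{q}{d}\,\frac{v^2}{c^2}, \tag{11.96}$$

and

$$\frac{\hat n_j}{c^3}\,\partial_t^2\int d^3x'\, j_i(t-r/c,\mathbf x')\, x'_j \sim \frac{\omega_s^2 q v d}{c^3} \sim \frac{q}{d}\,\frac{v^3}{c^3}, \tag{11.97}$$

and so on for the higher-order terms. In summary: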


• The structure of the multipole expansion of the radiation field is
quite different from the multipole expansion of the static fields
studied in Chapter 6. In the static case, we were interested in
the fields at distances $r$ large compared to the size $d$ of the system,
and we wanted to compute the corrections, subleading in $d/r$,
with respect to the leading Coulomb term. These corrections are
parametrized by the multipoles of the source, which in this case
are time-independent and are therefore often called the static
multipoles. For instance, the electric dipole produces a contribution
to the scalar potential $\phi$ of order $1/r^2$, see eq. (6.9), compared to
the $1/r$ behavior of the Coulomb potential. In terms of the electric
field, the leading term is the Coulomb term proportional to $1/r^2$
and the expansion in static multipoles has the general structure

$$E \sim \frac{q}{r^2}\left[1 + O\!\left(\frac{d}{r}\right) + O\!\left(\frac{d^2}{r^2}\right) + \ldots\right]. \tag{11.98}$$

In contrast, in the multipole expansion of the radiation field, the
gauge potentials and the electric and magnetic fields receive an
infinite series of contributions from all multipoles, already at order
$1/r$, associated with terms of higher and higher order in $v/c$. As
we have seen in eqs. (11.93) and (11.96), this expansion starts from
order $v^2/c^2$, and vanishes in the static limit $v = 0$. Schematically,
for the electric field

$$E \sim \frac{q}{r\,d}\left[O\!\left(\frac{v^2}{c^2}\right) + O\!\left(\frac{v^3}{c^3}\right) + \ldots\right]. \tag{11.99}$$

For static fields $v/c \to 0$, and therefore the $1/r$ term vanishes,
leaving us with the expansion (11.98) that starts from order $1/r^2$.


• The multipoles that appear in the expansion of the radiation field
are functions of retarded time $t - r/c$, and are sometimes called
"radiative multipoles," to distinguish them from the static
multipoles that enter in the expansion of the static fields. It is precisely
the dependence on retarded time (and therefore, eventually, the
fact that light travels at a finite speed) that is responsible for the
appearance of the $1/r$ terms in the radiation field. Electromagnetic
radiation is a property of a relativistic theory.


• The would-be leading term in the expansion of the scalar potential
(11.87) is the Coulomb term $[1/(4\pi\epsilon_0)]\,(q/r)$. However, this
term does not depend on time, and gives no contribution to the
$1/r$ part of the electric field at infinity. The leading contribution to the
radiation field at infinity coming from the scalar potential is therefore
the dipole term, while the leading contribution from the vector
potential is given by the term proportional to the current in
eq. (11.88). Comparing their contributions to the electric field,
we see that these two terms are of the same order in $v/c$, compare
eqs. (11.93) and (11.96). Indeed, we will see in the next section that
both terms can be expressed in terms of the electric dipole. The
next-to-leading terms are given by the electric quadrupole term in
eq. (11.87) and the term involving $j_i(t - r/c, \mathbf x')\,x'_j$ in eq. (11.88),
which are both suppressed by an extra power of $v/c$, compared to
the leading terms.


Having understood the general structure of the multipole expansion
of the radiation field, in the next subsections we will study in detail the
leading and next-to-leading terms.


11.2.1 Electric dipole radiation


As we have seen, since the term proportional to $q/r$ in eq. (11.87) does
not depend on retarded time, when taking its gradient to compute $\mathbf E$ we
get only a contribution of order $1/r^2$, which therefore does not contribute
to the radiation field. The terms giving the leading contributions to the
radiation field are obtained setting

$$\phi(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r\,c}\,\hat n_i\,\dot d_i(t-r/c), \tag{11.100}$$

and, from eq. (11.88),

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\int d^3x'\, j_i(t-r/c,\mathbf x'). \tag{11.101}$$

This expression for $A_i$ can be rewritten in terms of $d_i$ by making use
of the obvious identity $\partial_k x_i = \delta_{ki}$ and using the conservation equation
(3.22),

$$\begin{aligned}
\int d^3x\, j_i(t,\mathbf x) &= \int d^3x\, j_k(t,\mathbf x)\,\partial_k x_i \\
&= -\int d^3x\, x_i\,\partial_k j_k(t,\mathbf x) \\
&= +\partial_t\int d^3x\,\rho(t,\mathbf x)\, x_i \\
&= \dot d_i(t).
\end{aligned} \tag{11.102}$$


In the second line we integrated by parts, discarding the boundary term
because the source is localized. Note that these are the same manipulations
that we used in eq. (6.31), except that there we were considering
the static case, so the time derivative vanished. Therefore, to leading
order in $v/c$,

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\,\dot d_i(t-r/c). \tag{11.103}$$

We see that both $\phi$ and $\mathbf A$, to this order in $v/c$, are determined by the
time derivative of the electric dipole. The corresponding radiation is
therefore called dipole radiation.

The corresponding contributions to the electric field, keeping only the
terms $O(1/r)$, are given by

$$\nabla\phi = -\frac{1}{4\pi\epsilon_0}\,\frac{\hat{\mathbf n}}{c^2 r}\,\hat n_i\,\ddot d_i(t-r/c), \tag{11.104}$$

as we have already computed in eq. (11.92), and

$$\frac{\partial\mathbf A}{\partial t} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\,\ddot{\mathbf d}(t-r/c). \tag{11.105}$$

Therefore, for dipole radiation

$$\mathbf E(t,\mathbf x) = -\nabla\phi - \frac{\partial\mathbf A}{\partial t}
= \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\hat{\mathbf n}\,(\hat{\mathbf n}\cdot\ddot{\mathbf d}) - \ddot{\mathbf d}\right]_{t-r/c}. \tag{11.106}$$

We have therefore recovered eq. (10.136), as the lowest-order term in an
expansion in $v/c$. We can then rewrite this result in the equivalent forms
(10.132) or (10.135). Observe also that the retarded time that appears
in these formulas is $t_{\rm ret} = t - r/c$, where $r$ is the distance to the fixed
origin of the coordinate system, which has been set inside the region
where the source is localized. The variation of the delay across different
points of the source, which is expressed by the factors $e^{-i\omega\hat{\mathbf n}\cdot\mathbf x'/c}$ in
eqs. (11.73) and (11.74), has already been taken into account, to this
order in $v/c$, by the expansion of the exponential.
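As a quick consistency check of eqs. (11.102) and (11.106), consider a single point charge $q$ moving on a trajectory $\mathbf x_0(t)$, with velocity $\mathbf v = \dot{\mathbf x}_0$ and acceleration $\mathbf a = \ddot{\mathbf x}_0$. The point-charge densities used below are the standard ones and are assumed here for illustration, rather than quoted from the text:

$$\begin{aligned}
\rho(t,\mathbf x) &= q\,\delta^{(3)}\big(\mathbf x - \mathbf x_0(t)\big), \qquad
\mathbf j(t,\mathbf x) = q\,\mathbf v(t)\,\delta^{(3)}\big(\mathbf x - \mathbf x_0(t)\big), \\
\mathbf d(t) &= \int d^3x\,\rho(t,\mathbf x)\,\mathbf x = q\,\mathbf x_0(t), \qquad
\int d^3x\,\mathbf j(t,\mathbf x) = q\,\mathbf v(t) = \dot{\mathbf d}(t), \\
\mathbf E(t,\mathbf x) &= \frac{q}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\hat{\mathbf n}\,(\hat{\mathbf n}\cdot\mathbf a) - \mathbf a\right]_{t-r/c}
= \frac{q}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\hat{\mathbf n}\times(\hat{\mathbf n}\times\mathbf a)\right]_{t-r/c}.
\end{aligned}$$

The second line is eq. (11.102) for this special case, and the last line follows from eq. (11.106) with $\ddot{\mathbf d} = q\,\mathbf a$: to lowest order in $v/c$, the radiation field is controlled by the acceleration of the charge evaluated at retarded time.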

The radiative part of the magnetic field is given by $c\mathbf B = \hat{\mathbf n}\times\mathbf E$, since,
for the $1/r$ part of the electromagnetic field, eq. (11.27) is an exact
relation valid to all orders in $v/c$. The situation is therefore completely
analogous to that discussed in Section 10.5, with the only distinction
that, in Section 10.5, which was only concerned with the lowest-order
term for large distances and small velocities, one could have used different
equivalent forms for the distance to the source and for retarded time,
compare for instance eqs. (10.127), (10.130), and (10.131). In contrast,
in the context of the systematic multipole expansion discussed here, the
only natural choice is to use $r$ for the distance to the source and,
correspondingly, $t - (r/c)$ for retarded time. In particular, for the Poynting
vector we write

$$\mathbf S = \frac{1}{\mu_0 c}\,|\mathbf E|^2\,\hat{\mathbf n}
= \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3 r^2}\,\left|\ddot{\mathbf d}(t-r/c)\times\hat{\mathbf n}\right|^2\hat{\mathbf n}. \tag{11.107}$$

Then, the power per solid angle $dP/d\Omega$ radiated in the dipole
approximation is written as

$$\frac{dP}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\,\left|\ddot{\mathbf d}(t-r/c)\times\hat{\mathbf n}\right|^2, \tag{11.108}$$

and the total radiated power is

$$P(t) = \frac{1}{4\pi\epsilon_0}\,\frac{2}{3c^3}\,\left|\ddot{\mathbf d}(t-r/c)\right|^2, \tag{11.109}$$

compare with eqs. (10.144) and (10.148). We have therefore rederived
the Larmor formula (10.148), as the lowest-order term in a systematic
expansion in $v/c$ of the $1/r$ contribution to the electric and magnetic
fields.
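The step from eq. (11.108) to eq. (11.109) is an angular integration, which can be checked numerically. The following is a minimal sketch under stated assumptions: the midpoint quadrature in $u=\cos\theta$ and $\phi$, the grid sizes, the function names and the sample value of $\ddot{\mathbf d}$ are all choices made for this illustration and do not come from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
C = 299_792_458.0        # speed of light in m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dipole_power_per_solid_angle(d_ddot, n_hat):
    """dP/dOmega of eq. (11.108) for a given second time derivative of d."""
    w = cross(d_ddot, n_hat)
    return sum(c * c for c in w) / (4.0 * math.pi * EPS0) / (4.0 * math.pi * C**3)


def larmor_power(d_ddot):
    """Total power of eq. (11.109)."""
    return 2.0 * sum(c * c for c in d_ddot) / (3.0 * C**3) / (4.0 * math.pi * EPS0)


def integrated_power(d_ddot, n_u=200, n_phi=32):
    """Midpoint-rule integral of dP/dOmega over the sphere, with u = cos(theta)."""
    du, dphi = 2.0 / n_u, 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_u):
        u = -1.0 + (a + 0.5) * du
        s = math.sqrt(1.0 - u * u)
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            n_hat = (s * math.cos(phi), s * math.sin(phi), u)
            total += dipole_power_per_solid_angle(d_ddot, n_hat) * du * dphi
    return total
```

For a sample vector such as `(1e-10, -2e-10, 3e-10)` (in C m/s²), `integrated_power` and `larmor_power` should agree up to the quadrature error, reflecting the fact that the angular average of $|\ddot{\mathbf d}\times\hat{\mathbf n}|^2$ is $(2/3)|\ddot{\mathbf d}|^2$.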


11.2.2 Radiation from charge quadrupole and 
magnetic dipole 


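We now look at the next-to-leading term. For $\phi$, this is given by the
second moment of the charge distribution, denoted by $D_{ij}$, see eq. (11.87),
while for $A_i$ it is given by the term proportional to $j_i(t - r/c, \mathbf x')\,x'_j$ in
eq. (11.88). We can rewrite the latter in a physically more transparent
form with a straightforward generalization to the time-dependent case
of the manipulations performed in eqs. (6.33) and (6.34),

$$\begin{aligned}
\int d^3x\, x_i\, j_j(t,\mathbf x) &= \int d^3x\, x_i\, j_k(t,\mathbf x)\,\partial_k x_j \\
&= -\int d^3x\, x_j\,\partial_k\left[x_i\, j_k(t,\mathbf x)\right] \\
&= -\int d^3x\, x_j\, j_i(t,\mathbf x) + \partial_t\int d^3x\,\rho(t,\mathbf x)\, x_i x_j,
\end{aligned} \tag{11.110}$$

where, again, we have used $\nabla\cdot\mathbf j = -\partial_t\rho$. Therefore

$$\int d^3x\left[x_i\, j_j(t,\mathbf x) + j_i(t,\mathbf x)\, x_j\right] = \dot D_{ij}(t). \tag{11.111}$$

We then split $j_i(t,\mathbf x)\,x_j$ into its symmetric and antisymmetric parts,

$$\int d^3x\, j_i x_j = \frac{1}{2}\int d^3x\left[j_i x_j + j_j x_i\right] + \frac{1}{2}\int d^3x\left[j_i x_j - j_j x_i\right], \tag{11.112}$$

and we define the antisymmetric tensor

$$m_{ij}(t) = \frac{1}{2}\int d^3x\left[x_i\, j_j(t,\mathbf x) - x_j\, j_i(t,\mathbf x)\right]. \tag{11.113}$$

The magnetic dipole $m_i(t)$ is then defined as

$$m_i = \frac{1}{2}\,\epsilon_{ijk}\, m_{jk}(t), \tag{11.114}$$

and therefore

$$\mathbf m(t) = \frac{1}{2}\int d^3x\;\mathbf x\times\mathbf j(t,\mathbf x). \tag{11.115}$$

Equation (11.114) can be inverted, as usual, as $m_{ij} = \epsilon_{ijk}\, m_k$. Therefore

$$\int d^3x\, j_i(t,\mathbf x)\, x_j = \frac{1}{2}\,\dot D_{ij}(t) - \epsilon_{ijk}\, m_k(t). \tag{11.116}$$

Equations (11.115) and (11.116) generalize eqs. (6.36) and (6.37) to the
time-dependent case.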

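Therefore, including only the leading and next-to-leading order in $v/c$,
$\phi$ is given by eq. (11.87), while $\mathbf A$ is obtained from eq. (11.88),

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\dot d_i(t-r/c) + \frac{1}{2c}\,\ddot D_{ij}(t-r/c)\,\hat n_j + \frac{1}{c}\,\epsilon_{ijk}\,\dot m_j(t-r/c)\,\hat n_k + \ldots\right], \tag{11.117}$$

where we used eq. (11.102) for the leading term. We now write

$$D_{ij}(t) = \frac{1}{3}\,Q_{ij}(t) + \delta_{ij}\, f(t), \tag{11.118}$$

where $f(t)$ can be read from eq. (11.86). Then, up to next-to-leading
order,

$$(4\pi\epsilon_0)\,\phi(t,\mathbf x) = \frac{q}{r} + \frac{1}{r c}\,\hat n_i\,\dot d_i(t-r/c) + \frac{1}{6 r c^2}\,\hat n_i\hat n_j\,\ddot Q_{ij}(t-r/c) + \frac{1}{2 r c^2}\,\ddot f(t-r/c), \tag{11.119}$$

$$(4\pi\epsilon_0 c^2)\,A_i(t,\mathbf x) = \frac{1}{r}\,\dot d_i(t-r/c) + \frac{1}{6 r c}\,\ddot Q_{ij}(t-r/c)\,\hat n_j + \frac{1}{r c}\,\epsilon_{ijk}\,\dot m_j(t-r/c)\,\hat n_k + \frac{1}{2 r c}\,\hat n_i\,\ddot f(t-r/c). \tag{11.120}$$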


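We next observe that

$$\partial_i\dot f(t-r/c) = \ddot f(t-r/c)\,\partial_i(t-r/c) = -\frac{\hat n_i}{c}\,\ddot f(t-r/c), \tag{11.121}$$

so,

$$\partial_i\left[\frac{\dot f(t-r/c)}{r}\right] = -\frac{\hat n_i}{r c}\,\ddot f(t-r/c) + O\!\left(\frac{1}{r^2}\right). \tag{11.122}$$

Therefore, to the order $1/r$ at which we are working, we can rewrite

$$\phi(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\left[\frac{q}{r} + \frac{1}{r c}\,\hat n_i\,\dot d_i(t-r/c) + \frac{1}{6 r c^2}\,\hat n_i\hat n_j\,\ddot Q_{ij}(t-r/c) + \partial_t\!\left(\frac{\dot f(t-r/c)}{2 r c^2}\right)\right], \tag{11.123}$$

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0 c^2}\left[\frac{1}{r}\,\dot d_i(t-r/c) + \frac{1}{6 r c}\,\ddot Q_{ij}(t-r/c)\,\hat n_j + \frac{1}{r c}\,\epsilon_{ijk}\,\dot m_j(t-r/c)\,\hat n_k - \partial_i\!\left(\frac{\dot f(t-r/c)}{2 r}\right)\right]. \tag{11.124}$$

Comparing with eq. (3.86), we see that the extra terms proportional to
$f$ correspond to a gauge transformation with

$$\theta(t,r) = \frac{\dot f(t-r/c)}{(4\pi\epsilon_0)\,2 r c^2}. \tag{11.125}$$

Since gauge-equivalent potentials give the same electric and magnetic
fields, we can simply drop these extra terms, and write

$$\phi(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{r}\left[\,q + \frac{1}{c}\,\hat n_i\,\dot d_i(t-r/c) + \frac{1}{6c^2}\,\hat n_i\hat n_j\,\ddot Q_{ij}(t-r/c)\right], \tag{11.126}$$

$$A_i(t,\mathbf x) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2 r}\left[\dot d_i(t-r/c) + \frac{1}{6c}\,\ddot Q_{ij}(t-r/c)\,\hat n_j + \frac{1}{c}\,\epsilon_{ijk}\,\dot m_j(t-r/c)\,\hat n_k\right]. \tag{11.127}$$

Note also that, again to leading order in $1/r$,

$$\Box\theta = O\!\left(\frac{1}{r^2}\right), \tag{11.128}$$

since $\Box f(t-r/c) = 0$. Therefore, under this gauge transformation, as far
as the $1/r$ terms are concerned, the gauge potentials $A_\mu$ remain in the
Lorenz gauge (that we have used in the computations of this section),
see the discussion below eq. (9.18).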

As we have already discussed, the term $q/r$ in $\phi$ does not contribute
to the radiation field, so the leading term is given by the contribution of
a time-varying dipole, both in $\phi$ and $\mathbf A$. We see that, at next-to-leading
order, the radiation is generated by the time variation of the quadrupole
moment of the charge distribution, and of the magnetic dipole moment.

The expansion could in principle be carried out to generic order. We 
already understand, from the computation at this order, that there will 
be two families of multipole moments that contribute: the multipole 
moments of the charge distribution, such as the charge dipole, charge 
quadrupole, and so on; and the multipole moments of the current dis- 
tribution, of which the magnetic moment is the lowest-order term. 

From the gauge potentials we can now compute the electric field up
to next-to-leading order. Introducing the vector

$$Q_i = Q_{ij}\,\hat n_j, \tag{11.129}$$

the result is

$$\mathbf E(t,\mathbf x) = \frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\left[\hat{\mathbf n}\times(\hat{\mathbf n}\times\ddot{\mathbf d}) + \frac{1}{6c}\,\hat{\mathbf n}\times(\hat{\mathbf n}\times\dddot{\mathbf Q}) + \frac{1}{c}\,\hat{\mathbf n}\times\ddot{\mathbf m}\right]_{t-r/c}, \tag{11.130}$$

where the subscript indicates that all quantities on the right-hand side
must be evaluated at $t - r/c$. To lowest order, we recover the dipole
electric field, in the form given in eq. (10.135). The magnetic field in
the radiation zone is then obtained, as usual, from $c\mathbf B = \hat{\mathbf n}\times\mathbf E$. The
power radiated, integrated over the solid angle, can be obtained with a
computation analogous to that in eq. (10.147). It can be more convenient
to use eq. (1.9) to rewrite $\mathbf E$ as

$$\mathbf E(t,\mathbf x) = -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{1}{r}\left\{\left[\ddot{\mathbf d} - \hat{\mathbf n}\,(\ddot{\mathbf d}\cdot\hat{\mathbf n})\right] + \frac{1}{6c}\left[\dddot{\mathbf Q} - \hat{\mathbf n}\,(\dddot{\mathbf Q}\cdot\hat{\mathbf n})\right] + \frac{1}{c}\,\ddot{\mathbf m}\times\hat{\mathbf n}\right\}_{t-r/c}. \tag{11.131}$$

We can then compute $|\mathbf E|^2$ and integrate it over the solid angle. Recalling
that $Q_i = Q_{ij}\hat n_j$ contains a factor of $\hat{\mathbf n}$, we see that, when we
integrate over $d\Omega$ (and make use of $\hat{\mathbf n}\cdot\hat{\mathbf n} = 1$), most of the integrals
can be performed with the identity (6.49), except for the square of the
quadrupole, that has up to four $\hat{\mathbf n}$ factors. These can be performed with
the identity

$$\int\frac{d\Omega}{4\pi}\,\hat n_i\hat n_j\hat n_k\hat n_l = \frac{1}{15}\left(\delta_{ij}\delta_{kl} + \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}\right). \tag{11.132}$$

Similarly to eq. (6.49), this identity can be obtained most efficiently by
first fixing the tensor structure from symmetry arguments: the right-hand
side must be a tensor, symmetric in all its pairs of indices,
constructed with combinations of two occurrences of $\delta_{ij}$. The overall
coefficient is then obtained by contracting among them the indices $(i, j)$
and $(k, l)$. Note that the integral over $d\Omega$ of an odd number of $\hat n_i$
factors, such as $\int d\Omega\,\hat n_i$ or $\int d\Omega\,\hat n_i\hat n_j\hat n_k$, vanishes because the integrand is
odd under $\hat{\mathbf n}\to-\hat{\mathbf n}$ (and the integration over the sphere, for each $\hat{\mathbf n}$,
contains also $-\hat{\mathbf n}$). In this way we see immediately that the mixed term
between the electric dipole and the magnetic dipole, and between the
electric dipole and the quadrupole, vanish. The mixed term between the
quadrupole and the magnetic dipole, after integration over the angles,
becomes proportional to $\epsilon_{ijk}Q_{ij}m_k$, which vanishes because $Q_{ij}$ is
symmetric. Therefore, the radiated power separates into an electric dipole
term, an electric quadrupole term, and a magnetic dipole term, without
mixed terms. The remaining computation is straightforward, and gives

$$P(t) = \frac{1}{4\pi\epsilon_0}\left[\frac{2}{3c^3}\,|\ddot{\mathbf d}|^2 + \frac{1}{180\,c^5}\,\dddot Q_{ij}\dddot Q_{ij} + \frac{2}{3c^5}\,|\ddot{\mathbf m}|^2\right]_{t-r/c}. \tag{11.133}$$

The first term was already obtained in eq. (11.109). As we already saw in
the estimates above eqs. (11.93) and (11.94), $Q_{ij}$ is smaller than $d_i$ by a
factor $O(v)$. Also, the magnetic dipole is smaller than the electric dipole
by a factor $O(v)$, as we can see comparing their respective definitions,
eqs. (11.82) and (11.115), and recalling eq. (3.26) (which is valid for a
point charge but extends to a charge distribution, with $v$ the typical
velocity of internal motions).$^{15}$ Then, we see that both the electric
quadrupole and the magnetic dipole contributions to the radiated power
are smaller by a factor $O(v^2/c^2)$, with respect to the power radiated by
the electric dipole. This is, of course, as expected, since we have seen
that the expansion in radiative multipoles is an expansion in powers
of $v/c$, and the terms in eq. (11.133) are quadratic in the multipole
moments.
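The angular identity (11.132) used in this computation can also be verified numerically, component by component. The sketch below averages $\hat n_i\hat n_j\hat n_k\hat n_l$ over the unit sphere with a midpoint rule in $u=\cos\theta$ and $\phi$, and compares with the right-hand side of eq. (11.132). The quadrature, the grid sizes and all function names are assumptions of this illustration.

```python
import itertools
import math


def sphere_average(func, n_u=200, n_phi=16):
    """Average func(n_hat) over the unit sphere (midpoint rule in u and phi)."""
    total = 0.0
    for a in range(n_u):
        u = -1.0 + (a + 0.5) * 2.0 / n_u
        s = math.sqrt(1.0 - u * u)
        for b in range(n_phi):
            phi = (b + 0.5) * 2.0 * math.pi / n_phi
            total += func((s * math.cos(phi), s * math.sin(phi), u))
    return total / (n_u * n_phi)


def delta(i, j):
    return 1.0 if i == j else 0.0


def identity_rhs(i, j, k, l):
    """Right-hand side of eq. (11.132)."""
    return (delta(i, j) * delta(k, l)
            + delta(i, k) * delta(j, l)
            + delta(i, l) * delta(j, k)) / 15.0


def max_deviation():
    """Largest |lhs - rhs| over all 81 index combinations."""
    worst = 0.0
    for i, j, k, l in itertools.product(range(3), repeat=4):
        lhs = sphere_average(lambda n: n[i] * n[j] * n[k] * n[l])
        worst = max(worst, abs(lhs - identity_rhs(i, j, k, l)))
    return worst
```

Calling `max_deviation()` should return a number at the level of the quadrature error, well below the size $1/15$ of the non-vanishing components.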

For a point charge evolving on a prescribed trajectory $\mathbf x_0(t)$, inserting
eq. (8.1) into eq. (11.85), we get

$$Q_{ij}(t) = q\left[3x_{0i}(t)\,x_{0j}(t) - |\mathbf x_0(t)|^2\,\delta_{ij}\right]. \tag{11.134}$$

Consider, for example, a charge performing a simple harmonic motion
along the $z$ axis, with amplitude $A$,

$$\mathbf x_0(t) = \hat{\mathbf z}\,A\cos(\omega_s t). \tag{11.135}$$

Then, the only non-vanishing components of $Q_{ij}$ are

$$Q_{11} = -qA^2\cos^2\omega_s t, \tag{11.136}$$
$$Q_{22} = -qA^2\cos^2\omega_s t, \tag{11.137}$$
$$Q_{33} = 2qA^2\cos^2\omega_s t. \tag{11.138}$$

Since $\cos^2\omega_s t = [1 + \cos(2\omega_s t)]/2$, the time-dependent part, which
contributes to the derivatives of $Q_{ij}$ in eq. (11.131), oscillates as $\cos(2\omega_s t)$, and
therefore the quadrupole radiation is at a frequency $\omega = 2\omega_s$.
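The harmonic-oscillator example can be reproduced directly from eq. (11.134). The following minimal sketch builds $Q_{ij}$ for a point charge and lets one check the components (11.136)-(11.138), the vanishing trace, and the fact that $Q_{33}$ repeats after half a period of the source, i.e. oscillates at $2\omega_s$. Function names and numerical values are illustrative assumptions.

```python
import math


def quadrupole_point_charge(q, x):
    """Q_ij of eq. (11.134) for a point charge q at position x."""
    r2 = sum(c * c for c in x)
    return [[q * (3.0 * x[i] * x[j] - (r2 if i == j else 0.0))
             for j in range(3)] for i in range(3)]


def oscillator_position(amplitude, omega_s, t):
    """Trajectory of eq. (11.135): harmonic motion along the z axis."""
    return (0.0, 0.0, amplitude * math.cos(omega_s * t))


def q33(q, amplitude, omega_s, t):
    """Component Q_33 along the oscillation axis."""
    x = oscillator_position(amplitude, omega_s, t)
    return quadrupole_point_charge(q, x)[2][2]
```

For example, with `q = 2.0`, `amplitude = 0.5`, `omega_s = 3.0`, the value `q33(q, amplitude, omega_s, t)` coincides with `q33(q, amplitude, omega_s, t + math.pi / omega_s)`, while the position itself changes sign over the same interval.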


$^{15}$ We will see this in more detail in eqs. (12.92) and (12.93).


11.3 Near zone, far zone and induction zone


In the previous section we have studied the large-$r$ limit, assuming that $r$
is sufficiently large, so that only the terms of order $1/r$ must be retained,
and the $1/r^2$ terms can be neglected. In this region, we have found that
the radiation field is given by plane waves, with the electric and magnetic
fields transverse to the propagation direction, sourced by the time-varying
multipole moments.

Now we ask how large $r$ must be for this expansion to be valid.
Consider for instance the scalar potential $\phi$. From the form (11.126),
that was already obtained keeping only the terms $O(1/r)$, we see that
the various terms in the expansion have the form $F(t - r/c)/r$, such as
$\dot d_i(t - r/c)/r$, $\ddot Q_{ij}(t - r/c)/r$, and so on, multiplied by factors $\hat n_i$. The
electric field is obtained from $\nabla\phi$. In particular, this produces terms of
the form

$$\frac{\partial}{\partial r}\left[\frac{F(t-r/c)}{r}\right] = -\frac{F(t-r/c)}{r^2} + \frac{1}{r}\,\frac{\partial}{\partial r}F(t-r/c). \tag{11.139}$$

Writing $u = t - r/c$,

$$\frac{\partial F}{\partial r} = \frac{dF(u)}{du}\,\frac{\partial u}{\partial r} = -\frac{1}{c}\,\frac{dF(u)}{du}, \tag{11.140}$$

while

$$\frac{\partial F}{\partial t} = \frac{dF(u)}{du}\,\frac{\partial u}{\partial t} = \frac{dF(u)}{du}, \tag{11.141}$$

and therefore,

$$\frac{\partial}{\partial r}F(t-r/c) = -\frac{1}{c}\,\frac{\partial}{\partial t}F(t-r/c). \tag{11.142}$$

If the typical frequency of the system is $\omega_s$, $\partial_t F$ is of order $\omega_s F$.
Therefore, in order of magnitude,

$$\frac{1}{r}\,\frac{\partial}{\partial r}F(t-r/c) \sim \frac{\omega_s}{c\,r}\,F(t-r/c). \tag{11.143}$$

Therefore, the term $F(t - r/c)/r^2$ on the right-hand side of eq. (11.139)
can be neglected with respect to the second term if

$$\frac{1}{r^2} \ll \frac{\omega_s}{c\,r}, \tag{11.144}$$

and therefore $r \gg c/\omega_s$. Since the frequency $\omega$ of the radiation emitted
is also of order $\omega_s$, we can write this as

$$r \gg \frac{c}{\omega} = \bar\lambda. \tag{11.145}$$

This defines the far zone, or radiation zone. Under this condition, all
other contributions of order $1/r^2$ to the electric field can be neglected.
In particular, we can neglect also the terms $O(1/r^2)$ in eq. (11.2) coming
from the expansion of $1/|\mathbf x - \mathbf x'|$ to higher order,

$$\frac{1}{|\mathbf x-\mathbf x'|} = \frac{1}{r} + \frac{1}{r}\,O\!\left(\frac{|\mathbf x'|}{r}\right). \tag{11.146}$$

Upon integration over the source, the last term becomes $O(d/r^2)$, where
$d$ (not to be confused with the dipole moment) is the size of the source.
This term is negligible, with respect to the $1/r$ terms that we have kept,
as long as $d/r \ll 1$. As we saw in eq. (11.76), the reduced wavelength
$\bar\lambda$ is of order $(c/v)\,d$, so $\bar\lambda > d$ (and, for non-relativistic sources,
$\bar\lambda \gg d$). Therefore, in the far zone $r \gg \bar\lambda$, the condition $r \gg d$ is also
automatically satisfied, so the $1/r^2$ terms from eq. (11.146) are also negligible.

From these estimates we see that the region outside a source of size
$d$ and typical frequency $\omega_s$, which emits radiation with a reduced
wavelength $\bar\lambda = c/\omega \sim c/\omega_s$, can be separated into a far region,

$$r \gg \bar\lambda \qquad \text{(far region)}, \tag{11.147}$$

and a near region,

$$r \ll \bar\lambda \qquad \text{(near region)}. \tag{11.148}$$

For non-relativistic sources we have $d \ll \bar\lambda$, and one can further
introduce the near outer region, defined by

$$d \ll r \ll \bar\lambda \qquad \text{(near outer region)}, \tag{11.149}$$

i.e., that part of the near region which is outside the source. The
intermediate region $r \sim \bar\lambda$ is called the induction region. In the far region, as
we have seen, the multipole expansion has the form (11.126), (11.127),
and consists of terms of the form $1/r$ times functions of retarded time
$t - r/c$. In contrast, in the near region the situation is reversed, and
retardation effects are negligible. This means that, to lowest order, we
can approximate eq. (11.1) as

$$A^\mu(t,\mathbf x) \simeq \frac{\mu_0}{4\pi}\int d^3x'\,\frac{j^\mu(t,\mathbf x')}{|\mathbf x-\mathbf x'|}, \tag{11.150}$$

where we have replaced $t - |\mathbf x - \mathbf x'|/c$ with $t$ in the first argument of $j^\mu$.$^{16}$
In the near outer region, where retardation effects are negligible and
furthermore $r \gg d$, we can then expand $1/|\mathbf x - \mathbf x'|$ as in eq. (6.4), or, more
generally, as in eq. (6.26). For a generic time-dependent source, we then
recover the multipole expansion for static fields, described in Chapter 6,
except that the source term $j^\mu(\mathbf x')$ is replaced by $j^\mu(t,\mathbf x')$, with the
gauge potentials and electromagnetic fields following the instantaneous
time behavior of the sources. For instance,

$$\begin{aligned}
\phi(t,\mathbf x) &= \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho(t - |\mathbf x-\mathbf x'|/c,\,\mathbf x')}{|\mathbf x-\mathbf x'|} \\
&\simeq \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho(t,\mathbf x')}{|\mathbf x-\mathbf x'|}.
\end{aligned} \tag{11.151}$$

$^{16}$ In Chapter 12 we will see how to improve systematically over this
approximation.

Performing the same steps as in the multipole expansion of static fields,
in the near outer region we then get

$$\phi(t,\mathbf x) \simeq \frac{1}{4\pi\epsilon_0}\left[\frac{q}{r} + \frac{\hat n_i\, d_i(t)}{r^2} + \frac{\hat n_i\hat n_j\, Q_{ij}(t)}{2r^3} + \ldots\right], \tag{11.152}$$

which is the same as eq. (6.20), except that $d_i$ and $Q_{ij}$ are replaced by
$d_i(t)$ and $Q_{ij}(t)$, and similarly for the higher-order multipoles. In the
same way, for the vector potential in the near region we get

$$\begin{aligned}
\mathbf A(t,\mathbf x) &= \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\mathbf j(t - |\mathbf x-\mathbf x'|/c,\,\mathbf x')}{|\mathbf x-\mathbf x'|} \\
&\simeq \frac{\mu_0}{4\pi}\int d^3x'\,\frac{\mathbf j(t,\mathbf x')}{|\mathbf x-\mathbf x'|},
\end{aligned} \tag{11.153}$$

whose expansion, in the near outer region, gives the magnetic dipole
term plus terms that depend on higher-order magnetic multipoles,

$$\mathbf A(t,\mathbf x) \simeq \frac{\mu_0}{4\pi}\,\frac{\mathbf m(t)\times\mathbf x}{r^3} + \ldots, \tag{11.154}$$

which is the same as eq. (6.38) with $\mathbf m$ replaced by $\mathbf m(t)$.

Equations (11.152) and (11.154) are the generalization of eqs. (6.20)
and (6.38), respectively, from static sources to time-dependent sources in
their near outer region. In the induction region, the spatial and time
dependence of the fields gradually evolves from that of the near outer region
to that of the far region, and no comparatively simple approximation to
eq. (11.1) is possible.
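The classification into regions given in eqs. (11.147)-(11.149) can be summarized in a small helper. Since the conditions involve "much larger" and "much smaller", any sharp threshold is a matter of convention: the factor `margin` below, as well as the function names and the sample numbers, are assumptions of this sketch and not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def reduced_wavelength(omega):
    """Reduced wavelength lambda_bar = c / omega."""
    return C / omega


def classify_region(r, d, omega, margin=10.0):
    """Classify the distance r from a source of size d and angular frequency omega.

    'margin' is a hypothetical factor standing in for the symbols >> and <<.
    """
    lam_bar = reduced_wavelength(omega)
    if r >= margin * lam_bar:
        return "far"            # eq. (11.147)
    if r <= lam_bar / margin:
        # inside the near region, eq. (11.148); 'near outer' if also r >> d
        return "near outer" if r >= margin * d else "near"
    return "induction"          # r comparable to lambda_bar
```

For a source of size 1 cm oscillating at 1 MHz, for which $\bar\lambda\approx 48$ m, this returns `"near outer"` at $r=1$ m, `"induction"` at $r=50$ m and `"far"` at $r=1$ km.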
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Problem 11.1. Decomposition of a vector field into transverse and 
longitudinal parts 


In this Solved Problem we discuss the decomposition of a vector field into its 
transverse and longitudinal parts, also known as the Helmholtz decomposition 
(or Helmholtz theorem). We used this decomposition in Section 11.1.2 in the 
computation of the radiation field performed in the Coulomb gauge, but this 
is, more generally, a useful mathematical tool. 

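In eq. (10.118) we saw how to decompose a vector into its longitudinal and
transverse parts. This decomposition can be generalized to a vector field $\mathbf V(\mathbf x)$,
writing

$$\mathbf V(\mathbf x) = \mathbf V_\parallel(\mathbf x) + \mathbf V_\perp(\mathbf x), \tag{11.155}$$

where

$$\mathbf V_\parallel(\mathbf x) = -\nabla\left[\nabla\cdot\int d^3x'\,\frac{\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}\right], \tag{11.156}$$

$$\mathbf V_\perp(\mathbf x) = \nabla\times\left[\nabla\times\int d^3x'\,\frac{\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}\right], \tag{11.157}$$

as long as the integral over $d^3x'$ in eqs. (11.156) and (11.157) converges (writing
$d^3x' = r'^2 dr'\,d\Omega'$ we see that this is ensured if $\mathbf V(\mathbf x)$ vanishes faster than
$1/|\mathbf x|^2$ as $|\mathbf x|\to\infty$). Indeed, using eq. (1.7), we see that, in components,

$$V_{\perp,i}(\mathbf x) = \partial_i\partial_j\int d^3x'\,\frac{V_j(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|} - \nabla^2\int d^3x'\,\frac{V_i(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}. \tag{11.158}$$

Then, using eq. (1.91),

$$V_{\perp,i}(\mathbf x) = V_i(\mathbf x) + \partial_i\partial_j\int d^3x'\,\frac{V_j(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}. \tag{11.159}$$

On the other hand, in components eq. (11.156) reads

$$V_{\parallel,i}(\mathbf x) = -\partial_i\partial_j\int d^3x'\,\frac{V_j(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}, \tag{11.160}$$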

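so, eq. (11.155) indeed holds. The usefulness of this decomposition comes from
the fact that, since $\mathbf V_\perp$ is the curl of a vector field and $\mathbf V_\parallel$ is the gradient of
a scalar field, they satisfy

$$\nabla\times\mathbf V_\parallel = 0, \tag{11.161}$$
$$\nabla\cdot\mathbf V_\perp = 0. \tag{11.162}$$

Eqs. (11.155)-(11.157) therefore provide the decomposition of the vector field
$\mathbf V(\mathbf x)$ into its curl-free and divergence-less parts. This decomposition is used
in many contexts in physics, so it can be useful to elaborate on it a bit further.
First of all, note that $\mathbf V_\parallel(\mathbf x)$ and $\mathbf V_\perp(\mathbf x)$ are non-local functionals of $\mathbf V(\mathbf x)$, in
the sense that their value in $\mathbf x$ depends on the values of $\mathbf V(\mathbf x')$ for all $\mathbf x'$, and
not only on the value of $\mathbf V(\mathbf x')$ and of a finite number of its derivatives at
$\mathbf x' = \mathbf x$. They are indeed given by a convolution with the Green's function
(4.15) of the Laplacian. These relations become, however, local in Fourier
space. In fact, using eq. (1.91), we see that eq. (11.160) implies

$$\nabla^2 V_{\parallel,i}(\mathbf x) = \partial_i\partial_j V_j. \tag{11.163}$$

Writing

$$\mathbf V(\mathbf x) = \int\frac{d^3k}{(2\pi)^3}\,\tilde{\mathbf V}(\mathbf k)\,e^{i\mathbf k\cdot\mathbf x}, \tag{11.164}$$

and similarly for $\mathbf V_\parallel(\mathbf x)$ and $\mathbf V_\perp(\mathbf x)$, eq. (11.163) becomes

$$-k^2\,\tilde V_{\parallel,i}(\mathbf k) = -k_i k_j\,\tilde V_j(\mathbf k), \tag{11.165}$$

where $k^2 = |\mathbf k|^2$. Therefore (for $\mathbf k \neq 0$),

$$\tilde V_{\parallel,i}(\mathbf k) = \frac{k_i k_j}{k^2}\,\tilde V_j(\mathbf k). \tag{11.166}$$

Writing $\tilde{\mathbf V}_\perp(\mathbf k) = \tilde{\mathbf V}(\mathbf k) - \tilde{\mathbf V}_\parallel(\mathbf k)$, it also follows that

$$\tilde V_{\perp,i}(\mathbf k) = \left(\delta_{ij} - \frac{k_i k_j}{k^2}\right)\tilde V_j(\mathbf k). \tag{11.167}$$

In terms of the unit vector $\hat{\mathbf k} = \mathbf k/k$, eqs. (11.166) and (11.167) read

$$\tilde{\mathbf V}_\parallel(\mathbf k) = \hat{\mathbf k}\left[\hat{\mathbf k}\cdot\tilde{\mathbf V}(\mathbf k)\right], \tag{11.168}$$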
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17 Equation (11.172) can be shown as 
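follows:

$$\begin{aligned}
\left[\nabla_{\mathbf x}\times\int d^3x'\,\frac{\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}\right]_i
&= \epsilon_{ijk}\,\partial_j\int d^3x'\,\frac{V_k(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|} \\
&= \epsilon_{ijk}\int d^3x'\,V_k(\mathbf x')\,\frac{\partial}{\partial x^j}\,\frac{1}{4\pi|\mathbf x-\mathbf x'|} \\
&= -\epsilon_{ijk}\int d^3x'\,V_k(\mathbf x')\,\frac{\partial}{\partial x'^j}\,\frac{1}{4\pi|\mathbf x-\mathbf x'|} \\
&= \epsilon_{ijk}\int d^3x'\,\frac{\partial V_k(\mathbf x')}{\partial x'^j}\,\frac{1}{4\pi|\mathbf x-\mathbf x'|},
\end{aligned}$$

where, in the last line, we integrated by parts and we observed that, by
assumption, $\mathbf V(\mathbf x)$ vanishes faster than $1/|\mathbf x|^2$ as $|\mathbf x|\to\infty$ (so that the integral in
eqs. (11.157) and (11.156) converges), and therefore the boundary term
vanishes. Similar manipulations can be used to prove eq. (11.171).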


$^{18}$ Observe that we could have derived directly the decomposition (11.174)
without passing through eqs. (11.156) and (11.157). The procedure is almost
equivalent, except that the boundary conditions to be imposed are
slightly different: the integrals in eq. (11.174) converge if both $(\nabla\cdot\mathbf V)(\mathbf x)$
and $(\nabla\times\mathbf V)(\mathbf x)$ go to zero faster than $1/|\mathbf x|^2$ as $|\mathbf x|\to\infty$. Furthermore,
the uniqueness theorem discussed in Solved Problem (4.1.5), eqs. (4.45)-(4.47),
ensures that this decomposition is unique as long as $\mathbf V(\mathbf x)$ goes to zero
as $|\mathbf x|\to\infty$. Therefore, eq. (11.174) is valid if $\mathbf V(\mathbf x)$ goes to zero (with any
speed) as $|\mathbf x|\to\infty$, and $(\nabla\cdot\mathbf V)(\mathbf x)$ and $(\nabla\times\mathbf V)(\mathbf x)$ go to zero faster than
$1/|\mathbf x|^2$. In contrast, the decomposition (11.155)-(11.157) assumes that $\mathbf V(\mathbf x)$
goes to zero faster than $1/|\mathbf x|^2$, which is more restrictive.


$^{19}$ Note that all the results of this decomposition are also valid for
time-dependent fields, simply replacing $\mathbf A(\mathbf x)$ by $\mathbf A(t,\mathbf x)$, since the time variable never
enters any of these equations.


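and

$$\tilde{\mathbf V}_\perp(\mathbf k) = \tilde{\mathbf V}(\mathbf k) - \hat{\mathbf k}\left[\hat{\mathbf k}\cdot\tilde{\mathbf V}(\mathbf k)\right]. \tag{11.169}$$

Comparing with eqs. (10.119) and (10.120) we see that $\tilde{\mathbf V}_\parallel(\mathbf k)$ is the component
of $\tilde{\mathbf V}(\mathbf k)$ parallel to $\mathbf k$, while $\tilde{\mathbf V}_\perp(\mathbf k)$ is the component of $\tilde{\mathbf V}(\mathbf k)$ transverse
to $\mathbf k$. Then, $\tilde{\mathbf V}_\perp(\mathbf k)$ is called the transverse part of $\tilde{\mathbf V}(\mathbf k)$ (since it is transverse
to $\mathbf k$) and $\tilde{\mathbf V}_\parallel(\mathbf k)$ is called the longitudinal part of $\tilde{\mathbf V}(\mathbf k)$. Note that $\tilde{\mathbf V}_\parallel(\mathbf k)$
and $\tilde{\mathbf V}_\perp(\mathbf k)$ are determined by $\tilde{\mathbf V}(\mathbf k)$ with the same value of $\mathbf k$, i.e., in wavenumber
space these relations are local. In wavenumber space, eqs. (11.161) and
(11.162) become

$$\mathbf k\times\tilde{\mathbf V}_\parallel(\mathbf k) = 0, \qquad \mathbf k\cdot\tilde{\mathbf V}_\perp(\mathbf k) = 0, \tag{11.170}$$

and the validity of these relations can be immediately checked from the explicit
forms (11.168) and (11.169). Another useful form of the decomposition is obtained
observing that$^{17}$

$$\nabla_{\mathbf x}\cdot\int d^3x'\,\frac{\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|} = \int d^3x'\,\frac{\nabla_{\mathbf x'}\cdot\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}, \tag{11.171}$$

$$\nabla_{\mathbf x}\times\int d^3x'\,\frac{\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|} = \int d^3x'\,\frac{\nabla_{\mathbf x'}\times\mathbf V(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}. \tag{11.172}$$

Therefore, introducing the notation

$$f(\mathbf x) = \nabla\cdot\mathbf V(\mathbf x), \qquad \mathbf w(\mathbf x) = \nabla\times\mathbf V(\mathbf x), \tag{11.173}$$

the decomposition (11.155)-(11.157) can be rewritten as$^{18}$

$$\mathbf V(\mathbf x) = -\nabla\left(\int d^3x'\,\frac{f(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}\right) + \nabla\times\left(\int d^3x'\,\frac{\mathbf w(\mathbf x')}{4\pi|\mathbf x-\mathbf x'|}\right). \tag{11.174}$$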


Recall that, in Solved Problem 4.1.5, we proved that, in R?, a vector field V(x) 
is determined uniquely by its curl and its divergence (under the assumption 
that it vanishes sufficiently fast at infinity). Equation (11.174) shows explicitly 
how to compute it, in terms of its divergence f(x) and its curl w(x). 
Applying this decomposition to the vector gauge potential A, we can write 


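$$\mathbf{A}(\mathbf{x}) = \mathbf{A}_\parallel(\mathbf{x}) + \mathbf{A}_\perp(\mathbf{x}) = \mathbf{A}_\perp(\mathbf{x}) - \nabla\alpha\,, \tag{11.175}$$
where, using $\mathbf{w}(\mathbf{x}) = \nabla\times\mathbf{A}(\mathbf{x}) = \mathbf{B}(\mathbf{x})$,
$$\mathbf{A}_\perp(\mathbf{x}) = \nabla\times\int d^3x'\,\frac{\mathbf{B}(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\,, \tag{11.176}$$
and
$$\alpha(\mathbf{x}) = \int d^3x'\,\frac{(\nabla\cdot\mathbf{A})(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\,. \tag{11.177}$$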


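Under a gauge transformation (3.86), $\mathbf{A}$ transforms as $\mathbf{A} \to \mathbf{A} - \nabla\theta$. Since the additional term is a gradient, it is reabsorbed into a transformation of $\alpha(\mathbf{x})$,
$$\mathbf{A}_\perp(\mathbf{x}) \to \mathbf{A}_\perp(\mathbf{x})\,, \tag{11.178}$$
$$\alpha(\mathbf{x}) \to \alpha(\mathbf{x}) + \theta(\mathbf{x})\,. \tag{11.179}$$
This shows that $\alpha(\mathbf{x})$, and therefore $\mathbf{A}_\parallel(\mathbf{x})$, is a pure gauge degree of freedom, that can be set to zero with a gauge transformation. In contrast, $\mathbf{A}_\perp(\mathbf{x})$ is gauge invariant. This was clear already from eq. (11.176), where $\mathbf{A}_\perp(\mathbf{x})$ is written in terms of the magnetic field $\mathbf{B}$, which is gauge invariant.¹⁹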
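
The statements above can be checked on a single Fourier mode. The following is only a minimal sketch: it assumes the convention in which the gradient becomes multiplication by $i\mathbf{k}$ in wavenumber space, and the helper names (`split_mode`, `gauge_transform_mode`) and the numerical values are illustrative choices, not taken from the text. It splits a mode into its longitudinal and transverse parts as in eqs. (11.168)–(11.169) and verifies that adding a pure gradient leaves the transverse part unchanged.

```python
def dot(a, b):
    # Bilinear product without complex conjugation: k is real, V may be complex.
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def split_mode(k, V):
    """Return (V_par, V_perp) for one Fourier mode V(k)."""
    coeff = dot(k, V) / dot(k, k)          # (k_hat . V) / |k|
    V_par = tuple(coeff * ki for ki in k)  # k_hat (k_hat . V)
    V_perp = tuple(Vi - Pi for Vi, Pi in zip(V, V_par))
    return V_par, V_perp


def gauge_transform_mode(k, A, theta):
    """A -> A - grad(theta), with grad -> i k (assumed Fourier convention)."""
    return tuple(Ai - 1j * ki * theta for Ai, ki in zip(A, k))


# Illustrative numbers only.
k = (1.0, 2.0, -0.5)
A = (0.3 + 0.1j, -1.2j, 0.7 + 0.0j)
theta = 0.4 - 0.9j

A_par, A_perp = split_mode(k, A)
_, A_perp_gauged = split_mode(k, gauge_transform_mode(k, A, theta))

# Difference between the transverse parts before and after the gauge transformation.
print(max(abs(p - g) for p, g in zip(A_perp, A_perp_gauged)))
```

The printed difference is zero up to rounding, because the gauge term is proportional to $\mathbf{k}$ and is therefore removed by the transverse projection.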


Post-Newtonian expansion 
and radiation reaction 


In this chapter we focus more closely on the dynamics in the near zone, 
$r \ll ƛ$. As we have discussed in Section 11.3, in the near zone, in the
lowest-order approximation, retardation effects can be neglected, and the 
electromagnetic field depends on the instantaneous value of the sources. 
In this limit, we recover the Newtonian dynamics. We now want to un- 
derstand how to systematically improve over this zero-th order approxi- 
mation, performing a systematic expansion for small retardation effects. 
This will give rise to the so-called post-Newtonian (PN) expansion. We 
will then study the back-reaction problem: a system of charges that 
emits electromagnetic radiation at infinity loses energy, and this affects 
the dynamics of the particles in the near zone. This will manifest itself 
through “radiation-reaction” forces in the near region. In this context, 
we will also have to deal with divergences that appear for point charges. 
We will see that a conceptually clean treatment of the problem only 
emerges by using renormalization techniques, typical of quantum field 
theory, despite our purely classical context (we already found a similar 
situation in Section 5.2.2). This chapter treats advanced subjects, some
parts of which are very technical, and is meant for advanced readers.


12.1 Expansion for small retardation 


We first discuss how to set up, in general, the expansion for small retar- 
dation effects, while in the subsequent sections we will explicitly perform 
the computation of the first leading terms. To this purpose, we start 
from eq. (11.1), that we rewrite here separately for the scalar and vector 
potentials, in the form 


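$$\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho(t-|\mathbf{x}-\mathbf{x}'|/c,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{12.1}$$
$$\mathbf{A}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\int d^3x'\,\frac{\mathbf{j}(t-|\mathbf{x}-\mathbf{x}'|/c,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{12.2}$$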
As we have seen, these equations are exact, in the sense that they are 
valid for arbitrary velocities of the internal motions of the source, and 
only assume that the source is localized, i.e., that p(t, x) and j(t, x) have 
compact support in x (or, more generally, that they decrease sufficiently 
fast as |x| — oo, so that the integrals converge). We also recall that 
these expressions have been derived fixing the Lorenz gauge, $\partial_\mu A^\mu = 0$.
¹The name “PN expansion” is borrowed from the corresponding expansion that is performed in General Relativity, where it corresponds to an expansion beyond the Newtonian gravitational potential (see e.g., Chapter 5 of Maggiore (2007) or Poisson and Will (2014) for an introduction). We will use this terminology also in the context of electrodynamics, to denote, more generally, the expansion beyond Newtonian dynamics. Both in electrodynamics and in gravity, the Newtonian dynamics is obtained in the formal limit $c \to \infty$ and, as we will see, the PN expansion corresponds to an expansion in powers of $(v/c)$, with $v$ the typical velocity of the inner motions of the source. Since in electrodynamics the Newtonian dynamics is governed by the Coulomb potential, in this case, instead of “post-Newtonian expansion,” the name “post-Coulombian expansion” is sometimes used, see e.g., Chapter 12 of Poisson and Will (2014).
The PN expansion is an expansion for small retardation effects, that
improves systematically on the approximation used in eqs. (11.151) and
(11.153), where retardation effects were simply neglected. This is obtained
by expanding the charge density and current distribution in eqs. (12.1)
and (12.2) as
$$\rho(t-|\mathbf{x}-\mathbf{x}'|/c,\mathbf{x}') = \rho(t,\mathbf{x}') - \frac{|\mathbf{x}-\mathbf{x}'|}{c}\,\partial_t\rho(t,\mathbf{x}') + \frac{1}{2}\left(\frac{|\mathbf{x}-\mathbf{x}'|}{c}\right)^2\partial_t^2\rho(t,\mathbf{x}') + \ldots\,, \tag{12.3}$$
and²
$$\mathbf{j}(t-|\mathbf{x}-\mathbf{x}'|/c,\mathbf{x}') = \mathbf{j}(t,\mathbf{x}') - \frac{|\mathbf{x}-\mathbf{x}'|}{c}\,\partial_t\mathbf{j}(t,\mathbf{x}') + \ldots\,. \tag{12.4}$$

²As we will see below, at a given PN order we actually need to expand the current only to two orders less than the charge density.



If we denote by $\omega_s$ the typical frequency of the source, each time derivative
brings a factor of order $\omega_s$, so eqs. (12.3) and (12.4) are actually an
expansion in the parameter $\omega_s|\mathbf{x}-\mathbf{x}'|/c$. As we have seen in Chapter 11,
$\omega_s$ is of order of the typical frequency $\omega$ of the radiation emitted, and
$\omega/c = 1/ƛ$. Therefore, the expansion is valid as long as $|\mathbf{x}-\mathbf{x}'| \ll ƛ$.
On the other hand, $\mathbf{x}'$ is an integration variable that, for a localized
source, is restricted to values $|\mathbf{x}'| \lesssim d$; since $ƛ \sim (c/v)d$, we have $d \ll ƛ$
for the non-relativistic sources that we consider in the PN expansion.
Therefore, the condition $|\mathbf{x}-\mathbf{x}'| \ll ƛ$ is equivalent to the condition
$|\mathbf{x}| = r \ll ƛ$. In conclusion, the PN expansion is valid in the near zone,
$r \ll ƛ$, and breaks down in the far zone. In particular, the PN approximation
cannot be used to compute directly the radiation field at infinity
(although, as we will see, it can be used to compute how the emission
of radiation at infinity affects the dynamics of the charges in the near
zone).
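
The breakdown of the expansion outside the near zone can be made concrete with a toy source. The sketch below is an illustration under stated assumptions: the source is taken to be a pure harmonic, $\rho(t) \propto \cos(\omega_s t)$, the delay is $\tau = |\mathbf{x}-\mathbf{x}'|/c$, and the function names are hypothetical. It compares the retarded value with the expansion (12.3) truncated at second order, for a small and for a large value of $\omega_s\tau$.

```python
import math


def rho_exact(t, tau, omega_s):
    """Retarded value rho(t - tau) of the toy source rho(t) = cos(omega_s t)."""
    return math.cos(omega_s * (t - tau))


def rho_expanded(t, tau, omega_s, order):
    """Taylor expansion of rho(t - tau) in tau, keeping time derivatives up to `order`."""
    total = 0.0
    for n in range(order + 1):
        # n-th time derivative of cos(omega_s t) is omega_s^n cos(omega_s t + n pi / 2).
        deriv = omega_s**n * math.cos(omega_s * t + n * math.pi / 2)
        total += (-tau) ** n / math.factorial(n) * deriv
    return total


t, omega_s = 0.3, 1.0
for tau in (0.1, 3.0):  # omega_s * tau small (near zone) versus of order one or larger
    err = abs(rho_exact(t, tau, omega_s) - rho_expanded(t, tau, omega_s, order=2))
    print(f"omega_s * tau = {omega_s * tau:.1f}: truncation error = {err:.2e}")
```

For $\omega_s\tau = 0.1$ the error is of the size of the first neglected term, $(\omega_s\tau)^3/3!$, while for $\omega_s\tau = 3$ the truncated series is no longer a useful approximation.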

From the above discussion we see that the PN approximation is valid
in the near region $r \ll ƛ$, without any assumption on the relative values
of $r$ and $d$. In the outer near region, $d \ll r \ll ƛ$, we could combine the
PN expansion with a multipole expansion, analogous to the expansion in
static multipoles of Chapter 6. However, a main application of the PN
expansion is that it allows us to compute the electromagnetic fields in the
inner near region, i.e., in the region where the charges are localized,
$r \lesssim d \ll ƛ$. In turn, this allows us to compute how a set of charges evolves
under their mutual interaction, beyond the Newtonian approximation.
Therefore, in this case no multipole expansion can be performed. Below
we will examine the general form of the PN expansion, valid also in the
inner near region.

In the inner near region, the expansion in eqs. (12.3) and (12.4) actually
becomes an expansion in powers of $v/c$. In fact, in the inner
near region both $|\mathbf{x}|$ and $|\mathbf{x}'|$ are of order $d$, so the expansion parameter
$\omega_s|\mathbf{x}-\mathbf{x}'|/c$ becomes of order $\omega_s d/c \sim v/c$. In general, a term
suppressed by a factor $(v/c)^{2n}$, with respect to the Newtonian dynamics,
is called a term of order $n$PN, with half-integer PN orders representing
odd powers of $v/c$; e.g., the 1PN order gives corrections of order $(v/c)^2$
to the Newtonian result, while the 1.5PN term is suppressed by $(v/c)^3$.
This notation, which is standard in the context of the PN expansion, is
also useful because, as we will see below, “half-integer” PN orders are
associated with radiation reaction, and this notation allows us to single
them out more clearly.

Fig. 12.1 The different expansion regimes discussed in the text.

In the previous chapters we have examined several different approximations
to eqs. (12.1) and (12.2) and, to put the PN expansion in the
correct perspective, it is useful to recall and compare them. It is convenient
to display the different regimes in the plane $(v/c, r)$, as in Fig. 12.1,
where $v$ is the typical velocity of the internal motions of the source, and
$r$ is the distance at which we compute the electromagnetic field. In this
plane, an important region is determined by the condition $r \sim ƛ$. As we
saw in eq. (11.76), $ƛ$ is of order $(c/v)d$, so this corresponds to $r \sim (c/v)d$.
This region is shown in Fig. 12.1 as a shaded thick curve, which diverges
for $v/c \to 0$ and decreases as $v/c$ increases, until it reaches the value
$r \sim d$ when $v/c \to 1$. The part of the plot well above this curve is the
far region (or far zone) $r \gg ƛ$, while that well below it is the near region
(or near zone). The shaded part $r \sim ƛ$ is the induction region, where
the transition between the near and the far regions takes place. We have
further marked, on the vertical axis, the value $r = d$, that separates the
inner near region from the outer near region.

In Chapter 6 we considered static sources, and we performed an expansion
for $r \gg d$. Within the plot in Fig. 12.1, a perfectly static
source corresponds to $v/c = 0$ and therefore to the vertical axis. In
this case, the “near” zone extends all the way up to spatial infinity and,
for $r \gg d$, the appropriate tool is the expansion in static multipoles,
as in eqs. (6.20) and (6.30). Clearly, the assumption of exactly static
sources is an idealization. A system of interacting charges will undergo
accelerations because of their mutual interactions and will have non-zero
velocities. If we denote by $\omega_s$ the typical order of magnitude of the
frequencies of such motions, this system will generate electromagnetic
waves with typical frequency $\omega \sim \omega_s$ (as we have seen, with numerical
coefficients of order one that depend on the multipole involved), and the
corresponding value of $ƛ$ is equal to $c/\omega \sim c/\omega_s$, and is finite, so the
expansion in static multipoles is actually only valid for $d \ll r \ll c/\omega_s$.

³Once extracted from the integral, the time derivative in eq. (12.7) becomes a total time derivative $d/dt$ given that, in this case, the integral is independent of $\mathbf{x}$, while for the 1PN and higher-order terms it remains a partial time derivative $\partial_t$.

For generic, non-zero values of $v/c$, at $r \gg ƛ$, i.e., above the shaded
curve in Fig. 12.1, we are in the far zone, and retardation effects are
crucial: as we have seen in Section 11.1, they are responsible for the
presence of terms in the electric and magnetic fields that decrease only
as $1/r$, i.e., of the radiation field. The appropriate treatment here is
the one developed in Section 11.1 for sources with arbitrary velocity.
If, furthermore, $v/c \ll 1$, we can perform an expansion in radiative
multipoles, as we discussed in Section 11.2.

In the part of the plot below the shaded curve we are in the near
zone, $r \ll ƛ$. On the vertical axis, at $v/c = 0$, we have the Newtonian
dynamics. As we have seen, if furthermore $r \gg d$ we can expand in
static multipoles, while for $r \lesssim d$ we must deal with the exact Newtonian
dynamics. As we move away from the vertical axis, while still
staying in the near zone, as long as $v/c \ll 1$ retardation effects can be
included perturbatively, using the PN expansion that we will discuss in
this chapter. If, furthermore, $r \gg d$, we can combine the PN expansion
with a multipole expansion.
After these general considerations, we are ready to perform the explicit
computation of the first non-trivial corrections to the Newtonian dynamics
of a system of point charges, which, as we will see, correspond
to the 1PN order. For the scalar potential, we insert eq. (12.3) into
eq. (12.1). This gives
$$\phi(t,\mathbf{x}) = \phi_{\rm N}(t,\mathbf{x}) + \phi_{\rm 0.5PN}(t,\mathbf{x}) + \phi_{\rm 1PN}(t,\mathbf{x}) + \ldots\,, \tag{12.5}$$
where³
$$\phi_{\rm N}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\frac{\rho(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{12.6}$$
$$\phi_{\rm 0.5PN}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\,\frac{1}{c}\,\frac{d}{dt}\int d^3x'\,\rho(t,\mathbf{x}')\,, \tag{12.7}$$
$$\phi_{\rm 1PN}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\,\partial_t^2\int d^3x'\,\rho(t,\mathbf{x}')\,|\mathbf{x}-\mathbf{x}'|\,. \tag{12.8}$$


The term $\phi_{\rm N}(t,\mathbf{x})$ is just the instantaneous Coulomb potential that describes
the Newtonian dynamics, with the potential $\phi_{\rm N}$ at time $t$ determined
by the charge density at the same value of $t$, while $\phi_{\rm 0.5PN}(t,\mathbf{x})$
and $\phi_{\rm 1PN}(t,\mathbf{x})$ are the 0.5PN correction and the 1PN correction, respectively.
Observe that, in practice, the order at which a given term enters
the PN expansion can be read simply from the powers of $1/c$ in front
of it. So, for instance, $\phi_{\rm 1PN}(t,\mathbf{x})$ in eq. (12.8) has an explicit factor
$1/c^2$. In any computation, such as the derivation of the equations of motion
of a system of point particles that we will perform below, the required powers
of $v$ will appear automatically for dimensional reasons, so this term will give a
correction of order $v^2/c^2$ to the Newtonian result.

We next observe that $\phi_{\rm 0.5PN}(t,\mathbf{x})$ actually vanishes because, for a
localized system of charges, the total charge
$$Q = \int d^3x\,\rho(t,\mathbf{x}) \tag{12.9}$$
is conserved, so $dQ/dt = 0$. Therefore, there is no 0.5PN correction to
the scalar potential.

For the vector potential we can perform the same expansion, plugging 
eq. (12.4) into eq. (12.2). However, because of the explicit 1/c? term 
in eq. (12.2), the expansion at 1PN order is obtained using simply the 
lowest-order term in eq. (12.4), 

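$$\mathbf{A}_{\rm 1PN}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\int d^3x'\,\frac{\mathbf{j}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{12.10}$$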


We now specialize to a system of point charges. In this case the charge 
density and current distributions are given by 


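$$\rho(t,\mathbf{x}) = \sum_{a=1}^{N} q_a\,\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]\,, \tag{12.11}$$
$$\mathbf{j}(t,\mathbf{x}) = \sum_{a=1}^{N} q_a\,\mathbf{v}_a(t)\,\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]\,. \tag{12.12}$$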


The action of a single charged point particle interacting with an external
electromagnetic field was already given in eq. (8.70). We find
it useful to write the interaction term in the form (8.74), as an integral
over $dt$ and $d^3x$; we will indeed see below that some aspects of the
computation are clearer if we deal with the fields $\phi(t,\mathbf{x})$ and $\mathbf{A}(t,\mathbf{x})$ at
a generic space-time point, rather than with the fields $\phi[t,\mathbf{x}_a(t)]$ and
$\mathbf{A}[t,\mathbf{x}_a(t)]$ evaluated on the particle trajectory. Furthermore, we recall
that eq. (8.74) gives the interaction of a given charge and current density
with an external electromagnetic field. To stress this, we rewrite
eq. (8.74) in the notation
$$S_{\rm int} = \int dt\,d^3x\,\left[-\rho(t,\mathbf{x})\,\phi_{\rm ext}(t,\mathbf{x}) + \mathbf{j}(t,\mathbf{x})\cdot\mathbf{A}_{\rm ext}(t,\mathbf{x})\right]. \tag{12.13}$$


When we are interested in the interaction between a set of point charges, 
it is natural to assume that the a-th charge is subject to the potentials 
generated by all other charges, except the a-th charge itself. Therefore, 
we will exclude self-interaction terms, analogous to the self-energy terms 
discussed and subtracted in Section 5.2.2. It should be acknowledged
that this is a somewhat “ad hoc” prescription. In Section 12.3.2 we will 
come back to these self-force terms, and we will give a deeper discussion 
of why, in the present computation, in which we work to 1PN order, 
they can indeed be discarded. Then, for a system of point charges, the 
action (8.70) becomes 


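$$S = S_{\rm free} + S_{\rm int}\,, \tag{12.14}$$
where
$$S_{\rm free} = -\sum_{a=1}^{N} m_a c^2\int dt\,\sqrt{1-\frac{v_a^2(t)}{c^2}}\,, \tag{12.15}$$
and
$$S_{\rm int} = \frac{1}{2}\sum_{a=1}^{N}\int dt\,d^3x\,\left[-\rho_a(t,\mathbf{x})\,\phi_{a,\rm ext}(t,\mathbf{x}) + \mathbf{j}_a(t,\mathbf{x})\cdot\mathbf{A}_{a,\rm ext}(t,\mathbf{x})\right], \tag{12.16}$$
where $\rho_a$, $\mathbf{j}_a$ are the charge and current density of the $a$-th particle,
$$\rho_a(t,\mathbf{x}) = q_a\,\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]\,, \tag{12.17}$$
$$\mathbf{j}_a(t,\mathbf{x}) = q_a\,\mathbf{v}_a(t)\,\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]\,, \tag{12.18}$$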


while $\phi_{a,\rm ext}$, $\mathbf{A}_{a,\rm ext}$ are the gauge potentials generated by all other particles,
except the $a$-th particle itself, which therefore are seen as “external”
gauge potentials by the $a$-th charge. So, for instance, the function
$\phi_{1,\rm ext}(t,\mathbf{x})$ is the scalar potential generated, at the point $\mathbf{x}$ and at time
$t$, by the particles $2,\ldots,N$, which have positions $\mathbf{x}_2(t),\ldots,\mathbf{x}_N(t)$; the
function $\phi_{2,\rm ext}(t,\mathbf{x})$ is the potential generated by the particles $1,3,\ldots,N$,
with positions $\mathbf{x}_1(t), \mathbf{x}_3(t),\ldots,\mathbf{x}_N(t)$, and so on. Note that the time
dependence of $\phi_{a,\rm ext}(t,\mathbf{x})$ and $\mathbf{A}_{a,\rm ext}(t,\mathbf{x})$ enters through the time dependence
of the position of the other particles $\mathbf{x}_b(t)$, with $b \neq a$, that
generate the potential felt by the particle $a$. Similarly to eq. (5.9), the
overall factor 1/2 in front of eq. (12.16) compensates for the double
counting of the particle pairs.
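
The role of the factor 1/2 can be illustrated on the Newtonian (Coulomb) part of the interaction. The sketch below is only illustrative: `K` stands for $1/(4\pi\epsilon_0)$ in arbitrary units, and the function names and test configuration are assumptions. It evaluates the interaction energy once as half of the double sum over $a$ and $b \neq a$, and once as a sum over unordered pairs.

```python
import math
from itertools import combinations

K = 1.0  # stands for 1/(4 pi eps0); arbitrary units assumed for this sketch


def coulomb_half_double_sum(charges, positions):
    """(1/2) sum_a sum_{b != a} K q_a q_b / r_ab, as in the structure of eq. (12.16)."""
    total = 0.0
    for a in range(len(charges)):
        for b in range(len(charges)):
            if b != a:
                r_ab = math.dist(positions[a], positions[b])
                total += K * charges[a] * charges[b] / r_ab
    return 0.5 * total


def coulomb_pair_sum(charges, positions):
    """Sum over unordered pairs a < b, each pair counted once."""
    return sum(
        K * charges[a] * charges[b] / math.dist(positions[a], positions[b])
        for a, b in combinations(range(len(charges)), 2)
    )


charges = [1.0, -2.0, 0.5]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
print(coulomb_half_double_sum(charges, positions))
print(coulomb_pair_sum(charges, positions))
```

The two numbers agree: each pair $(a,b)$ appears twice in the double sum, once as seen from particle $a$ and once as seen from particle $b$.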

It is useful at this point to recall that the interaction action (12.13) 
is invariant under the gauge transformation (3.86), that, in the present 
notation, reads 


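$$\mathbf{A}_{\rm ext}(t,\mathbf{x}) \to \mathbf{A}_{\rm ext}(t,\mathbf{x}) - \nabla\theta\,, \tag{12.19}$$
$$\phi_{\rm ext}(t,\mathbf{x}) \to \phi_{\rm ext}(t,\mathbf{x}) + \frac{\partial\theta}{\partial t}\,. \tag{12.20}$$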


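As we saw in eq. (8.75), this is a consequence of the conservation equation
$\partial_\mu j^\mu = 0$, or equivalently, of eq. (3.22). Indeed, under the transformation
(12.19), (12.20), the interaction action (12.13) changes as
$$S_{\rm int} \to S_{\rm int} + \int dt\,d^3x\,\left[-\rho(t,\mathbf{x})\,\partial_t\theta - \mathbf{j}(t,\mathbf{x})\cdot\nabla\theta\right] = S_{\rm int} + \int dt\,d^3x\;\theta(t,\mathbf{x})\left[\frac{\partial\rho(t,\mathbf{x})}{\partial t} + \nabla\cdot\mathbf{j}(t,\mathbf{x})\right], \tag{12.21}$$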


where we integrated by parts discarding the boundary term, using the
fact that the system of charges is localized in space. The term in brackets
then vanishes, because of eq. (3.22). We now observe that, in eq. (12.16),
we can perform a gauge transformation independently on each term of
the sum, i.e., for each $(\phi_{a,\rm ext}, \mathbf{A}_{a,\rm ext})$. Indeed, if we transform
$$\mathbf{A}_{a,\rm ext} \to \mathbf{A}_{a,\rm ext} - \nabla\theta_a\,, \tag{12.22}$$
$$\phi_{a,\rm ext}(t,\mathbf{x}) \to \phi_{a,\rm ext}(t,\mathbf{x}) + \frac{\partial\theta_a}{\partial t}\,, \tag{12.23}$$
with an independent gauge function $\theta_a$ for each $a$, the $a$-th term in the
sum in eq. (12.16) changes as
$$\int dt\,d^3x\,\left[-\rho_a\,\phi_{a,\rm ext} + \mathbf{j}_a\cdot\mathbf{A}_{a,\rm ext}\right] \to \int dt\,d^3x\,\left[-\rho_a\,\phi_{a,\rm ext} + \mathbf{j}_a\cdot\mathbf{A}_{a,\rm ext}\right] + \int dt\,d^3x\;\theta_a(t,\mathbf{x})\left[\frac{\partial\rho_a(t,\mathbf{x})}{\partial t} + \nabla\cdot\mathbf{j}_a(t,\mathbf{x})\right], \tag{12.24}$$
and for each particle separately the term in brackets vanishes, as we saw
in eq. (3.30).⁴ We will make use of this extended gauge freedom below.


12.2.1 The gauge potentials to 1PN order 


We can now compute explicitly the expression for the gauge potential
to 1PN order. We consider first the scalar potential $\phi_{a,\rm ext}(t,\mathbf{x})$. For
the Newtonian term, using eq. (12.6), and eq. (12.11) with the self-term
excluded, we get
$$\phi_{a,\rm ext}(t,\mathbf{x})\big|_{\rm N} = \frac{1}{4\pi\epsilon_0}\int d^3x'\,\sum_{b\neq a} q_b\,\delta^{(3)}[\mathbf{x}'-\mathbf{x}_b(t)]\,\frac{1}{|\mathbf{x}-\mathbf{x}'|} = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_b}{|\mathbf{x}-\mathbf{x}_b(t)|}\,, \tag{12.25}$$
which, of course, is just the Newtonian potential already computed in
eq. (5.8), except that now the particles that generate this potential have
coordinates $\mathbf{x}_b(t)$, functions of time.

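For the 1PN term, from eq. (12.8),⁵
$$\phi_{a,\rm ext}(t,\mathbf{x})\big|_{\rm 1PN} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\,\partial_t^2\int d^3x'\,\sum_{b\neq a} q_b\,\delta^{(3)}[\mathbf{x}'-\mathbf{x}_b(t)]\,|\mathbf{x}-\mathbf{x}'| = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\,\partial_t^2|\mathbf{x}-\mathbf{x}_b(t)|\,. \tag{12.26}$$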


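Therefore, to 1PN order, in the Lorenz gauge in which we are working,
$$\phi_{a,\rm ext}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a} q_b\left[\frac{1}{|\mathbf{x}-\mathbf{x}_b(t)|} + \frac{1}{2c^2}\,\partial_t^2|\mathbf{x}-\mathbf{x}_b(t)|\right]. \tag{12.27}$$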
⁴The implicit assumption is that we have a set of charges that interact among themselves electromagnetically, but retain their individuality; for instance, they do not merge together, and they do not decay into other charged particles, because of electromagnetic or other interactions.


⁵Observe that, even if, eventually, in eq. (12.16) $\phi_{a,\rm ext}(t,\mathbf{x})$ is multiplied by $\rho_a(t,\mathbf{x})$, which is given by eq. (12.17) and therefore is proportional to $\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]$, still, in eq. (12.26) we cannot yet replace $\mathbf{x}$ by $\mathbf{x}_a(t)$. In eq. (12.26), the $\partial_t^2$ operator acts only on $\mathbf{x}_b(t)$ and not on $\mathbf{x}$, since $\phi_{a,\rm ext}(t,\mathbf{x})$ is the potential at a generic point $\mathbf{x}$ in space, with a time-dependence due to the fact that it is generated by a set of charges $q_b$, with $b \neq a$, on time-dependent trajectories $\mathbf{x}_b(t)$. If one replaced $\mathbf{x} = \mathbf{x}_a(t)$ in eq. (12.26) before taking the time derivatives, one would make a mistake, introducing spurious time derivatives of the function $\mathbf{x}_a(t)$. Similar considerations hold for the gauge transformations that we will perform below on $\phi_{a,\rm ext}(t,\mathbf{x})$ and $\mathbf{A}_{a,\rm ext}(t,\mathbf{x})$, with a gauge function $\theta_a(t,\mathbf{x})$ that will also depend on $t$ and $\mathbf{x}$ through $|\mathbf{x}-\mathbf{x}_b(t)|$, see eq. (12.29). When we take the time derivative and spatial gradient of such a function, we must make clear the distinction between the $\mathbf{x}$ dependence and the $t$ dependence, which is not possible if we replace from the start $\mathbf{x}$ by $\mathbf{x}_a(t)$. For these reasons, it is necessary to perform the computation starting from the expression (12.16) of the interaction action, without yet performing the integral over $d^3x$ with the use of the delta function $\delta^{(3)}[\mathbf{x}-\mathbf{x}_a(t)]$ which is present in $\rho_a(t,\mathbf{x})$ and $\mathbf{j}_a(t,\mathbf{x})$.
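Similarly, plugging eq. (12.12) into eq. (12.10), and discarding again the
term with $b = a$, we find, again to 1PN order,
$$\mathbf{A}_{a,\rm ext}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0 c^2}\sum_{b\neq a}\int d^3x'\,\frac{q_b\,\mathbf{v}_b(t)\,\delta^{(3)}[\mathbf{x}'-\mathbf{x}_b(t)]}{|\mathbf{x}-\mathbf{x}'|} = \frac{1}{4\pi\epsilon_0 c^2}\sum_{b\neq a}\frac{q_b\,\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\,. \tag{12.28}$$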

If now one just proceeded by computing explicitly the second time
derivative of $|\mathbf{x}-\mathbf{x}_b(t)|$ in eq. (12.27), one would find terms that depend
on the accelerations $d^2\mathbf{x}_b/dt^2$, and therefore the corresponding interaction
action (12.16) would also appear to depend on the accelerations of
the particles, leading to equations of motion involving time derivatives of
order higher than the second. As we will discuss in Section 12.2.3, such
a dependence on higher-order derivatives is in principle unavoidable at
higher orders in the PN expansion; however, it can be eliminated, order
by order in the PN expansion, expressing these higher derivatives in
terms of the positions and velocities, by using the equations of motion to
lower orders. While this procedure becomes necessary to higher orders
of the PN expansion, at the 1PN order at which we are working it is
much simpler to observe that these higher-derivative terms can be eliminated
with a gauge transformation of the form (12.22), (12.23), choosing
as gauge function
$$\theta_a(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\,\partial_t|\mathbf{x}-\mathbf{x}_b(t)|\,. \tag{12.29}$$

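This function is chosen so as to eliminate the term involving $\partial_t^2$ in
eq. (12.27), since, under this gauge transformation,
$$\phi_{a,\rm ext}(t,\mathbf{x}) \to \phi_{a,\rm ext}(t,\mathbf{x}) + \frac{\partial\theta_a}{\partial t} = \phi_{a,\rm ext}(t,\mathbf{x}) - \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\,\partial_t^2|\mathbf{x}-\mathbf{x}_b(t)|\,. \tag{12.30}$$
Therefore, in the new gauge,
$$\phi_{a,\rm ext}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_b}{|\mathbf{x}-\mathbf{x}_b(t)|}\,, \tag{12.31}$$
which is just the Newtonian potential. In other words, this gauge is
chosen so that $\phi_{a,\rm ext}(t,\mathbf{x})|_{\rm 1PN} = 0$, which has been possible because the
term $\phi_{a,\rm ext}(t,\mathbf{x})|_{\rm 1PN}$ in eq. (12.26) is a total time derivative. Under this
gauge transformation $\mathbf{A}_{a,\rm ext}(t,\mathbf{x})$ picks up an extra term,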


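$$\mathbf{A}_{a,\rm ext}(t,\mathbf{x}) \to \mathbf{A}_{a,\rm ext}(t,\mathbf{x}) - \nabla\theta_a(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{c^2}\sum_{b\neq a}\frac{q_b\,\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|} + \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\,\partial_t\nabla|\mathbf{x}-\mathbf{x}_b(t)|\,. \tag{12.32}$$
We next use
$$\nabla|\mathbf{x}-\mathbf{x}_b(t)| = \frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\,, \tag{12.33}$$
and⁶
$$\partial_t\left[\frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\right] = -\frac{\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|} + \frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|^3}\,\{[\mathbf{x}-\mathbf{x}_b(t)]\cdot\mathbf{v}_b(t)\}\,. \tag{12.36}$$
So, in the new gauge,
$$\mathbf{A}_{a,\rm ext}(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\left\{\frac{\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|} + \frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|^3}\,[\mathbf{x}-\mathbf{x}_b(t)]\cdot\mathbf{v}_b(t)\right\}. \tag{12.37}$$
From this expression, we can verify explicitly that
$$\nabla\cdot\mathbf{A}_{a,\rm ext}(t,\mathbf{x}) = 0\,, \tag{12.38}$$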


and therefore we have simply reached the Coulomb gauge (3.92) for each
of the $\mathbf{A}_{a,\rm ext}$, at 1PN order. So, we started from the general expression
for the gauge potentials in the Lorenz gauge, where eqs. (12.1) and (12.2)
hold; we have computed them explicitly to 1PN order, and we have then
performed a gauge transformation that puts these expressions for the
gauge potentials in the Coulomb gauge. Note that, since $\nabla\cdot\mathbf{A}_{a,\rm ext} = 0$
but $\partial_t\phi_{a,\rm ext} \neq 0$, these potentials no longer satisfy the Lorenz gauge
condition $\partial_\mu A^\mu = 0$. Equivalently, one could have performed the PN
expansion working from the start in the Coulomb gauge (which can be
done to all orders in the PN expansion), using eq. (3.93) for $\phi_{a,\rm ext}$,
sourced by $\rho_{a,\rm ext} = \sum_{b\neq a}\rho_b$. This immediately gives eq. (12.31), in
fact to all PN orders. We can then insert this expression for $\phi_{a,\rm ext}$ into
eq. (3.94) and solve it in an expansion for small retardation effects. To
1PN order, only the lowest-order term of this expansion is needed, and gives
eq. (12.37).
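
The Coulomb-gauge property (12.38) can also be checked numerically. The following is a minimal sketch under stated assumptions: the overall prefactor $1/(4\pi\epsilon_0\,2c^2)$ is dropped, the divergence is estimated with central finite differences, and the function names and the test configuration are illustrative. For comparison it also evaluates the divergence of the Lorenz-gauge expression (12.28), which does not vanish.

```python
import math


def a_coulomb(x, sources):
    """Eq. (12.37) without the prefactor 1/(4 pi eps0 * 2 c^2).

    `sources` is a list of (q_b, x_b, v_b) for the particles b != a.
    """
    A = [0.0, 0.0, 0.0]
    for q, xb, vb in sources:
        r = [xi - xbi for xi, xbi in zip(x, xb)]
        rn = math.sqrt(sum(ri * ri for ri in r))
        rv = sum(ri * vi for ri, vi in zip(r, vb))
        for i in range(3):
            A[i] += q * (vb[i] / rn + r[i] * rv / rn**3)
    return A


def a_lorenz(x, sources):
    """Eq. (12.28) in the same units, i.e. 2 q_b v_b / |x - x_b|."""
    A = [0.0, 0.0, 0.0]
    for q, xb, vb in sources:
        rn = math.dist(x, xb)
        for i in range(3):
            A[i] += 2.0 * q * vb[i] / rn
    return A


def divergence(field, x, sources, h=1e-5):
    """Central finite-difference estimate of div(field) at the point x."""
    div = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        div += (field(xp, sources)[i] - field(xm, sources)[i]) / (2 * h)
    return div


sources = [
    (1.0, (0.0, 0.0, 0.0), (0.3, -0.2, 0.1)),
    (-2.0, (1.0, 0.5, -0.3), (0.0, 0.4, 0.2)),
]
x = (0.7, -1.1, 0.9)
print("div A (Coulomb gauge):", divergence(a_coulomb, x, sources))
print("div A (Lorenz gauge): ", divergence(a_lorenz, x, sources))
```

The first number is zero up to discretization error, while the second is of order one in these units; by the Lorenz condition it is compensated by the time derivative of the scalar potential.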


12.2.2 Effective dynamics of a system of point 
charges 


Equations (12.31) and (12.37) allow us to eliminate the gauge fields
in terms of the variables $\mathbf{x}_a(t)$ and $\mathbf{v}_a(t)$ which describe the charged
particles, up to 1PN order. This process can in principle be carried out
to all orders in the PN expansion. In this way, in the near region the
coupled dynamics of the electromagnetic field and the charged particles
can be expressed entirely in terms of the degrees of freedom $\mathbf{x}_a(t)$, $\mathbf{v}_a(t)$
describing the charged particles, with all these variables evaluated at the
same value $t$ of time, as in Newtonian mechanics.

The elimination of $\phi(t,\mathbf{x})$ and $\mathbf{A}(t,\mathbf{x})$ in favor of $\mathbf{x}_a(t)$, $\mathbf{v}_a(t)$ could in
principle be performed either at the level of the action or at the level of
the equations of motion and, naively, one might think that the two procedures
should be equivalent. Further reflection, however, shows that,
at least at a generic order of the PN expansion, this cannot be the case.
If we perform the elimination at the level of the action, starting from
eq. (12.16), we eventually end up with a dynamics described by an action,
and therefore a Lagrangian, that depends only on the positions and
velocities of the particles [we will see in Section 12.2.3 how higher-order
derivatives, that in principle can also appear at a generic order of the
PN expansion, can be re-expressed order by order in the expansion in
terms of the positions and velocities $\mathbf{x}_a(t)$ and $\mathbf{v}_a(t)$]. Such a Lagrangian
has no explicit time dependence and therefore the corresponding Hamiltonian
is automatically conserved on the solutions of the equations of
motion.⁷ Therefore, a Lagrangian description cannot capture dissipative
terms, i.e., terms that describe the decrease of the energy of the system.
However, we know that, beyond some PN order, dissipative terms must
be present, to account for the energy lost by the system of charges to
electromagnetic radiation. To obtain them, we cannot work at the level
of the Lagrangian, but we must rather work at the level of the equations
of motion, starting from the Lorentz force equation (8.83), which is the
equation of motion derived from the full Lagrangian (8.71), and eliminate
the electric and magnetic fields from it, using their expressions
obtained from the solution of the gauge potentials in terms of $\mathbf{x}_a(t)$,
$\mathbf{v}_a(t)$, order by order in the PN expansion. This procedure produces the
full answer, including both conservative and dissipative terms.

⁶The explicit computations go as follows. Equation (12.33) is obtained writing
$$\nabla|\mathbf{x}-\mathbf{x}_b(t)| = \frac{\nabla|\mathbf{x}-\mathbf{x}_b(t)|^2}{2|\mathbf{x}-\mathbf{x}_b(t)|} = \frac{\nabla\left[\mathbf{x}\cdot\mathbf{x} - 2\mathbf{x}\cdot\mathbf{x}_b(t) + \mathbf{x}_b(t)\cdot\mathbf{x}_b(t)\right]}{2|\mathbf{x}-\mathbf{x}_b(t)|} = \frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\,, \tag{12.34}$$
while eq. (12.36) follows from
$$\partial_t\left[\frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\right] = -\frac{\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|} - \frac{\mathbf{x}-\mathbf{x}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|^2}\,\partial_t|\mathbf{x}-\mathbf{x}_b(t)|\,,$$
and the last term is computed writing
$$\partial_t|\mathbf{x}-\mathbf{x}_b(t)| = \frac{\partial_t|\mathbf{x}-\mathbf{x}_b(t)|^2}{2|\mathbf{x}-\mathbf{x}_b(t)|} = \frac{\partial_t\left[\mathbf{x}\cdot\mathbf{x} - 2\mathbf{x}\cdot\mathbf{x}_b(t) + \mathbf{x}_b(t)\cdot\mathbf{x}_b(t)\right]}{2|\mathbf{x}-\mathbf{x}_b(t)|} = -\frac{[\mathbf{x}-\mathbf{x}_b(t)]\cdot\mathbf{v}_b(t)}{|\mathbf{x}-\mathbf{x}_b(t)|}\,. \tag{12.35}$$

⁷The proof is a standard result from classical mechanics: given a system with generalized coordinates $q_i(t)$ and Lagrangian $L(q_i, \dot{q}_i)$ (with no explicit time dependence), the conjugate momentum is $p_i = \delta L/\delta\dot{q}_i$ and the Hamiltonian is $H = p_i\dot{q}_i - L$, while the equation of motion is
$$\frac{d}{dt}\frac{\delta L}{\delta\dot{q}_i} - \frac{\delta L}{\delta q_i} = 0\,.$$
In terms of $p_i$, this means that $\dot{p}_i = \delta L/\delta q_i$. Then
$$\frac{dH}{dt} = \dot{p}_i\dot{q}_i + p_i\ddot{q}_i - \frac{dL}{dt} = \frac{\delta L}{\delta q_i}\,\dot{q}_i + p_i\ddot{q}_i - \frac{dL}{dt}\,.$$
However
$$\frac{d}{dt}L(q_i,\dot{q}_i) = \frac{\delta L}{\delta q_i}\frac{dq_i}{dt} + \frac{\delta L}{\delta\dot{q}_i}\frac{d\dot{q}_i}{dt} = \frac{\delta L}{\delta q_i}\,\dot{q}_i + p_i\ddot{q}_i\,,$$
and therefore $dH/dt = 0$.

In this section we explicitly carry out the elimination of the gauge 
potentials at 1PN order, both at the level of the equations of motion 
and at the level of the Lagrangian, and we will find that, to this order, 
the two procedures are equivalent, and that the dynamics is conservative. 
As we will see in Section 12.3.3, dissipative terms start from 1.5PN order. 
Beyond 1PN order, one must therefore work at the level of equations of 
motion, in order to obtain the full result including the conservative and 
the dissipative terms. 


Effective dynamics from the Lorentz force equation 


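We begin by performing the explicit computation using the equations of
motion (8.83). We observe that
$$(\mathbf{v}\times\mathbf{B})_i = [\mathbf{v}\times(\nabla\times\mathbf{A})]_i = \epsilon_{ijk}\epsilon_{klm}\,v_j\partial_l A_m = v_j\partial_i A_j - v_j\partial_j A_i\,, \tag{12.39}$$
where we used the identity (1.7). Then, the Lorentz force equation (8.83)
can be written as
$$\frac{d}{dt}(\gamma_a m_a\mathbf{v}_a) = \left[-q_a\nabla_{\mathbf{x}}\phi_{a,\rm ext} - q_a\frac{\partial\mathbf{A}_{a,\rm ext}}{\partial t} + q_a v_{a,j}\nabla_{\mathbf{x}}A^j_{a,\rm ext} - q_a v_{a,j}\partial_j\mathbf{A}_{a,\rm ext}\right]_{\mathbf{x}=\mathbf{x}_a(t)}, \tag{12.40}$$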


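where $\gamma_a = \gamma(v_a)$ and we have stressed that, in the force acting on the $a$-th
particle, the electric and magnetic fields are evaluated on the position
of the particle, i.e., for $\mathbf{x} = \mathbf{x}_a(t)$ (and we recall that spatial indices can
be equivalently written as upper or lower indices). Using
$$\frac{d}{dt}\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)] = \left[\frac{\partial\mathbf{A}_{a,\rm ext}(t,\mathbf{x})}{\partial t} + \frac{dx^j_a(t)}{dt}\,\partial_j\mathbf{A}_{a,\rm ext}(t,\mathbf{x})\right]_{\mathbf{x}=\mathbf{x}_a(t)} = \left[\partial_t\mathbf{A}_{a,\rm ext}(t,\mathbf{x}) + v^j_a\,\partial_j\mathbf{A}_{a,\rm ext}(t,\mathbf{x})\right]_{\mathbf{x}=\mathbf{x}_a(t)}, \tag{12.41}$$
we see that eq. (12.40) can be rewritten as
$$\frac{d}{dt}\left\{\gamma_a m_a\mathbf{v}_a + q_a\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)]\right\} = \left[-q_a\nabla_{\mathbf{x}}\phi_{a,\rm ext}(t,\mathbf{x}) + q_a v_{a,j}\nabla_{\mathbf{x}}A^j_{a,\rm ext}(t,\mathbf{x})\right]_{\mathbf{x}=\mathbf{x}_a(t)}. \tag{12.42}$$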
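For consistency with the fact that we are working to 1PN order, we also
expand the left-hand side as in eq. (12.67) keeping only the correction
$v_a^2/c^2$, so we write
$$\frac{d}{dt}\left\{\left(1+\frac{v_a^2}{2c^2}\right)m_a\mathbf{v}_a + q_a\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)]\right\} = \left[-q_a\nabla_{\mathbf{x}}\phi_{a,\rm ext}(t,\mathbf{x}) + q_a v_{a,j}\nabla_{\mathbf{x}}A^j_{a,\rm ext}(t,\mathbf{x})\right]_{\mathbf{x}=\mathbf{x}_a(t)}. \tag{12.43}$$
On the right-hand side, $\phi_{a,\rm ext}$ and $A^j_{a,\rm ext}$ are given by eqs. (12.31) and
(12.37), respectively. After having computed their gradients with respect
to the variable $\mathbf{x}$ [using eq. (12.33)], we finally set $\mathbf{x} = \mathbf{x}_a(t)$.
It is now convenient to use the notation
$$\phi_a(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_b}{|\mathbf{x}_a(t)-\mathbf{x}_b(t)|}\,, \tag{12.44}$$
and
$$\mathbf{A}_a(\mathbf{x}_1,\ldots,\mathbf{x}_N;\mathbf{v}_1,\ldots,\mathbf{v}_N) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a} q_b\left\{\frac{\mathbf{v}_b(t)}{|\mathbf{x}_a(t)-\mathbf{x}_b(t)|} + \frac{\mathbf{x}_a(t)-\mathbf{x}_b(t)}{|\mathbf{x}_a(t)-\mathbf{x}_b(t)|^3}\,[\mathbf{x}_a(t)-\mathbf{x}_b(t)]\cdot\mathbf{v}_b(t)\right\}, \tag{12.45}$$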

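which automatically takes into account that the gauge potentials are
evaluated at $\mathbf{x} = \mathbf{x}_a(t)$ and treats them symmetrically with respect to
the positions (and velocities) of all particles. We also introduce the
notation
$$\mathbf{r}_{ab}(t) = \mathbf{x}_a(t) - \mathbf{x}_b(t)\,, \tag{12.46}$$
and $r_{ab} = |\mathbf{r}_{ab}|$, $\hat{\mathbf{r}}_{ab} = \mathbf{r}_{ab}/r_{ab}$, so that
$$\phi_a(\mathbf{x}_1,\ldots,\mathbf{x}_N) = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_b}{r_{ab}}\,, \tag{12.47}$$
and
$$\mathbf{A}_a(\mathbf{x}_1,\ldots,\mathbf{x}_N;\mathbf{v}_1,\ldots,\mathbf{v}_N) = \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a}\frac{q_b}{r_{ab}}\left[\mathbf{v}_b + \hat{\mathbf{r}}_{ab}(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right]. \tag{12.48}$$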
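Then, eq. (12.43) reads
$$\frac{d}{dt}\left[\left(1+\frac{v_a^2}{2c^2}\right)m_a\mathbf{v}_a + q_a\mathbf{A}_a(\mathbf{x}_1,\ldots,\mathbf{x}_N;\mathbf{v}_1,\ldots,\mathbf{v}_N)\right] = -q_a\nabla_{\mathbf{x}_a}\phi_a + \sum_{b\neq a}\frac{q_a q_b}{8\pi\epsilon_0 c^2\,r_{ab}^2}\left\{\mathbf{v}_a(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b) + \mathbf{v}_b(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a) - \hat{\mathbf{r}}_{ab}\left[(\mathbf{v}_a\cdot\mathbf{v}_b) + 3(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right]\right\}. \tag{12.49}$$
Observe that the quantity inside $d/dt$ on the left-hand side is just the
conjugate momentum (expanded to 1PN order), compare with eq. (8.78).
More explicitly, eq. (12.49) reads
$$\frac{d}{dt}\left\{\left(1+\frac{v_a^2}{2c^2}\right)m_a\mathbf{v}_a + \frac{1}{4\pi\epsilon_0}\,\frac{1}{2c^2}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\left[\mathbf{v}_b + \hat{\mathbf{r}}_{ab}(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right]\right\} = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}^2}\left\{\hat{\mathbf{r}}_{ab} + \frac{1}{2c^2}\left[\mathbf{v}_a(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b) + \mathbf{v}_b(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a) - \hat{\mathbf{r}}_{ab}\left((\mathbf{v}_a\cdot\mathbf{v}_b) + 3(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right)\right]\right\}. \tag{12.50}$$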

These are the 1PN equations of motion for a set of charged particles. 
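
As an illustration of how eq. (12.50) might be evaluated in practice, the sketch below computes its two sides' building blocks for a given configuration: the right-hand side (the force) and the quantity inside $d/dt$ on the left-hand side (the conjugate momentum to 1PN order). The constants `K` (standing for $1/(4\pi\epsilon_0)$) and `C` (the speed of light), the units, and the function names are assumptions made for this example only.

```python
import math

K = 1.0   # stands for 1/(4 pi eps0); arbitrary units assumed
C = 10.0  # speed of light in the same (arbitrary) units


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def separation(x, a, b):
    """Return (r_ab, unit vector r_hat_ab) with r_ab = x_a - x_b, eq. (12.46)."""
    r = [xa - xb for xa, xb in zip(x[a], x[b])]
    rn = math.sqrt(dot(r, r))
    return rn, [ri / rn for ri in r]


def force_1pn(a, q, x, v):
    """Right-hand side of eq. (12.50) for particle a."""
    F = [0.0, 0.0, 0.0]
    for b in range(len(q)):
        if b == a:
            continue
        rab, n = separation(x, a, b)
        nva, nvb, vavb = dot(n, v[a]), dot(n, v[b]), dot(v[a], v[b])
        pref = K * q[a] * q[b] / rab**2
        for i in range(3):
            pn = v[a][i] * nvb + v[b][i] * nva - n[i] * (vavb + 3 * nva * nvb)
            F[i] += pref * (n[i] + pn / (2 * C**2))
    return F


def momentum_1pn(a, m, q, x, v):
    """Quantity inside d/dt on the left-hand side of eq. (12.50)."""
    va2 = dot(v[a], v[a])
    P = [(1 + va2 / (2 * C**2)) * m[a] * vi for vi in v[a]]
    for b in range(len(q)):
        if b == a:
            continue
        rab, n = separation(x, a, b)
        nvb = dot(n, v[b])
        for i in range(3):
            P[i] += K * q[a] * q[b] / (2 * C**2 * rab) * (v[b][i] + n[i] * nvb)
    return P


# Two equal charges at rest: the force reduces to Coulomb repulsion.
q = [1.0, 1.0]
m = [1.0, 1.0]
x = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
v = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(force_1pn(0, q, x, v))
print(momentum_1pn(0, m, q, x, v))
```

With vanishing velocities only the Newtonian term survives: particle 0 is pushed away from particle 1, and the conjugate momentum is zero.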


The Darwin Lagrangian 


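We now perform the elimination of the gauge fields in terms of $\mathbf{x}_a(t)$,
$\mathbf{v}_a(t)$ at the level of the action, again at 1PN level. This is done inserting
eqs. (12.31) and (12.37) into eq. (12.16), and carrying out the integrals
over $d^3x$ with the help of the Dirac deltas present in $\rho_a$ and $\mathbf{j}_a$. The
result is
$$S_{\rm int} = -\frac{1}{8\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b\neq a}\int dt\,\frac{q_a q_b}{r_{ab}}\left[1 - \frac{\mathbf{v}_a\cdot\mathbf{v}_b + (\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)}{2c^2}\right]. \tag{12.51}$$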
Since we are working to 1PN order, i.e., corrections up to order $(v/c)^2$
to the Newtonian result, we also expand the square-root in eq. (12.15)
keeping only the $(v/c)^2$ correction to the Newtonian kinetic energy,
$$-m_a c^2\sqrt{1-\frac{v_a^2}{c^2}} = -m_a c^2 + \frac{1}{2}m_a v_a^2 + \frac{1}{8}m_a\frac{v_a^4}{c^2} + \ldots\,. \tag{12.52}$$


As far as we are interested in using the Lagrangian to derive the equa- 
tions of motion, the term —m,c? is a constant, and can be dropped (of 
course, if we rather want to compute the Hamiltonian and therefore the 
energy, this gives the rest energy of the particle). Then, summing up, 
we get the Lagrangian of a system of point charges, up to 1PN order, 


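$$L = L_{\rm N} + L_{\rm 1PN}\,, \tag{12.53}$$
where the Newtonian term is
$$L_{\rm N} = \sum_{a=1}^{N}\frac{1}{2}m_a v_a^2 - \frac{1}{8\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\,, \tag{12.54}$$
and the 1PN correction is
$$L_{\rm 1PN} = \sum_{a=1}^{N}\frac{m_a v_a^4}{8c^2} + \frac{1}{4\pi\epsilon_0}\,\frac{1}{4c^2}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\left[\mathbf{v}_a\cdot\mathbf{v}_b + (\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right]. \tag{12.55}$$
The total Lagrangian up to 1PN order is called the Darwin Lagrangian.⁸
From eqs. (12.44) and (12.48), we see that we can also rewrite the result
as
$$L_{\rm N} = \sum_{a=1}^{N}\left[\frac{1}{2}m_a v_a^2 - \frac{1}{2}q_a\phi_a(\mathbf{x}_1,\ldots,\mathbf{x}_N)\right], \tag{12.56}$$
$$L_{\rm 1PN} = \sum_{a=1}^{N}\frac{m_a v_a^4}{8c^2} + \frac{1}{2}\sum_{a=1}^{N} q_a\,\mathbf{v}_a\cdot\mathbf{A}_a(\mathbf{x}_1,\ldots,\mathbf{x}_N;\mathbf{v}_1,\ldots,\mathbf{v}_N)\,. \tag{12.57}$$
Note that, in the Coulomb gauge that we are using, $\phi_a$ is a purely
Newtonian term, with no 1PN correction, while $\mathbf{A}_a$ is a purely 1PN
term.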

We can now check that eq. (12.49) is indeed the equation of motion 
derived from the Darwin Lagrangian. To keep the notation simple, we 
perform the computation explicitly limiting ourselves to the case N = 2. 
Then, 


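$$
L_N=\frac{1}{2}m_1v_1^2+\frac{1}{2}m_2v_2^2-\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{r_{12}},
\tag{12.58}
$$

$$
L_{\rm 1PN}=\frac{1}{8c^2}\left(m_1v_1^4+m_2v_2^4\right)
+\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{2c^2r_{12}}
\left[\mathbf{v}_1\cdot\mathbf{v}_2
+(\hat{\mathbf{r}}_{12}\cdot\mathbf{v}_1)(\hat{\mathbf{r}}_{12}\cdot\mathbf{v}_2)\right].
\tag{12.59}
$$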


We now compute the equation of motion, for instance taking the vari- 
ation with respect to the variables of the first particle, separating the 
Newtonian and the 1PN terms, 


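$$
\left[\frac{d}{dt}\left(\frac{\partial L_N}{\partial\mathbf{v}_1}\right)
-\frac{\partial L_N}{\partial\mathbf{x}_1}\right]
+\left[\frac{d}{dt}\left(\frac{\partial L_{\rm 1PN}}{\partial\mathbf{v}_1}\right)
-\frac{\partial L_{\rm 1PN}}{\partial\mathbf{x}_1}\right]=0.
\tag{12.60}
$$

In the case of two particles we use the simpler notation
$\mathbf{r}_{12}=r\,\hat{\mathbf{n}}$. The variations with respect to
$\mathbf{x}_1$ are computed using

$$
\frac{\partial r}{\partial\mathbf{x}_1}=\hat{\mathbf{n}},
\tag{12.61}
$$

and (as in eq. (11.8) with $\mathbf{x}$ replaced by $\mathbf{x}_1-\mathbf{x}_2$)

$$
\frac{\partial n^i}{\partial x_1^j}=\frac{1}{r}\left(\delta^{ij}-n^in^j\right).
\tag{12.62}
$$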

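The Newtonian term gives the standard result,

$$
\frac{d}{dt}\left(\frac{\partial L_N}{\partial\mathbf{v}_1}\right)
-\frac{\partial L_N}{\partial\mathbf{x}_1}
=\frac{d}{dt}\left(m_1\mathbf{v}_1\right)
-\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{r^2}\hat{\mathbf{n}},
\tag{12.63}
$$

while

$$
\frac{d}{dt}\left(\frac{\partial L_{\rm 1PN}}{\partial\mathbf{v}_1}\right)
=\frac{d}{dt}\left\{\frac{m_1v_1^2}{2c^2}\mathbf{v}_1
+\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{2c^2r}
\left[\mathbf{v}_2+\hat{\mathbf{n}}(\hat{\mathbf{n}}\cdot\mathbf{v}_2)\right]\right\},
\tag{12.64}
$$

and

$$
\frac{\partial L_{\rm 1PN}}{\partial\mathbf{x}_1}
=\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{r^2}\frac{1}{2c^2}
\left\{\mathbf{v}_1(\hat{\mathbf{n}}\cdot\mathbf{v}_2)
+\mathbf{v}_2(\hat{\mathbf{n}}\cdot\mathbf{v}_1)
-\hat{\mathbf{n}}\left[(\mathbf{v}_1\cdot\mathbf{v}_2)
+3(\hat{\mathbf{n}}\cdot\mathbf{v}_1)(\hat{\mathbf{n}}\cdot\mathbf{v}_2)\right]\right\}.
\tag{12.65}
$$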


Plugging these expressions into eq. (12.60), we recover eq. (12.50) for 
the case a = 1 and N = 2. The computation for generic N is analogous. 
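The position derivative in eq. (12.65) can be cross-checked numerically. The sketch below differentiates the velocity-dependent interaction term of eq. (12.59) by central finite differences and compares the result with the closed form. The shorthand `k` for $q_1q_2/(4\pi\epsilon_0)$, the units with $c=1$, and the sample positions and velocities are arbitrary choices made only for this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def separation(x1, x2):
    """Return r = |x1 - x2| and the unit vector n = (x1 - x2)/r."""
    d = [a - b for a, b in zip(x1, x2)]
    r = math.sqrt(dot(d, d))
    return r, [di / r for di in d]

def l1pn_interaction(x1, x2, v1, v2, k=1.0, c=1.0):
    """Position-dependent part of L_1PN in eq. (12.59).

    k is shorthand (an assumption of this sketch) for q1*q2/(4*pi*eps0).
    """
    r, n = separation(x1, x2)
    return k / (2 * c**2 * r) * (dot(v1, v2) + dot(n, v1) * dot(n, v2))

def grad_x1_closed_form(x1, x2, v1, v2, k=1.0, c=1.0):
    """Right-hand side of eq. (12.65)."""
    r, n = separation(x1, x2)
    nv1, nv2 = dot(n, v1), dot(n, v2)
    pref = k / (2 * c**2 * r**2)
    return [pref * (v1[i] * nv2 + v2[i] * nv1
                    - n[i] * (dot(v1, v2) + 3 * nv1 * nv2))
            for i in range(3)]

def grad_x1_numeric(x1, x2, v1, v2, h=1e-6):
    """Central finite differences of l1pn_interaction with respect to x1."""
    grad = []
    for i in range(3):
        xp = list(x1)
        xm = list(x1)
        xp[i] += h
        xm[i] -= h
        grad.append((l1pn_interaction(xp, x2, v1, v2)
                     - l1pn_interaction(xm, x2, v1, v2)) / (2 * h))
    return grad

# Arbitrary test configuration (illustrative values only)
x1, x2 = [1.0, 0.5, -0.3], [-0.7, 0.2, 0.4]
v1, v2 = [0.1, -0.2, 0.05], [-0.03, 0.07, 0.2]
exact = grad_x1_closed_form(x1, x2, v1, v2)
approx = grad_x1_numeric(x1, x2, v1, v2)
max_diff = max(abs(e - a) for e, a in zip(exact, approx))
print(exact, approx, max_diff)
```

The kinetic part of $L_{\rm 1PN}$ does not depend on the positions, so only the interaction term contributes to this gradient.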

This shows that the equations of motion at 1PN order, given in 
eq. (12.50) and obtained by eliminating from the Lorentz force equation 
the electric and magnetic fields in terms of the variables $\mathbf{x}_a(t)$, $\mathbf{v}_a(t)$ that
describe the charges, can also be derived from a Lagrangian (obtained by 
eliminating the gauge fields at the level of the action describing the cou- 
pled dynamics of the charges and the electromagnetic field). Therefore, 
the corresponding 1PN Hamiltonian, whose explicit form we will com- 
pute below, is conserved on the solution of the equations of motion, see 
Note 7 on page 304, and the dynamics up to 1PN order is conservative. 


The Hamiltonian to 1PN order 


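To complete the discussion of the dynamics at 1PN order, we compute
the corresponding Hamiltonian. To this purpose, we first compute the
conjugate momentum $\mathbf{P}_a$, that we already introduced in eq. (8.78) for a
generic vector potential. In the present case, from eqs. (12.53)–(12.55),

$$
\mathbf{P}_a=\frac{\partial L}{\partial\mathbf{v}_a}
=\left(1+\frac{v_a^2}{2c^2}\right)m_a\mathbf{v}_a
+\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\sum_{b\neq a}\frac{q_aq_b}{r_{ab}}
\left[\mathbf{v}_b+\hat{\mathbf{r}}_{ab}(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right].
\tag{12.66}
$$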


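Comparing with eq. (12.48), we see that this is just the same as

$$
\mathbf{P}_a=\left(1+\frac{v_a^2}{2c^2}\right)m_a\mathbf{v}_a
+q_a\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)],
\tag{12.67}
$$

in agreement with eq. (8.78) (with $\mathbf{p}=\gamma m\mathbf{v}$ expanded to second
order in $v/c$). The Hamiltonian is then computed from

$$
H=\sum_{a=1}^{N}\mathbf{P}_a\cdot\mathbf{v}_a-L,
\tag{12.68}
$$

where $\mathbf{v}_a$ is written in terms of $\mathbf{P}_a$ by inverting eq. (12.66). To the
1PN order at which we are working, this can be done by writing eq. (12.66) as

$$
\mathbf{v}_a=\frac{\mathbf{P}_a}{m_a}-\frac{v_a^2}{2c^2}\mathbf{v}_a
-\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\sum_{b\neq a}\frac{q_aq_b}{m_ar_{ab}}
\left[\mathbf{v}_b+\hat{\mathbf{r}}_{ab}(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b)\right],
\tag{12.69}
$$

and substituting, in the terms proportional to $1/c^2$, the lowest-order relation
$\mathbf{v}_a=\mathbf{P}_a/m_a$, since the terms neglected are of higher order. This gives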


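$$
\mathbf{v}_a=\frac{\mathbf{P}_a}{m_a}-\frac{P_a^2}{2m_a^3c^2}\mathbf{P}_a
-\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\sum_{b\neq a}\frac{q_aq_b}{m_am_br_{ab}}
\left[\mathbf{P}_b+\hat{\mathbf{r}}_{ab}(\hat{\mathbf{r}}_{ab}\cdot\mathbf{P}_b)\right].
\tag{12.70}
$$

Plugging this into eqs. (12.68) and (12.55) we get

$$
H=\sum_{a=1}^{N}\frac{P_a^2}{2m_a}\left(1-\frac{P_a^2}{4m_a^2c^2}\right)
+\frac{1}{8\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_aq_b}{r_{ab}}
-\frac{1}{4\pi\epsilon_0}\frac{1}{4c^2}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_aq_b}{m_am_br_{ab}}
\left[\mathbf{P}_a\cdot\mathbf{P}_b
+(\hat{\mathbf{r}}_{ab}\cdot\mathbf{P}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{P}_b)\right].
\tag{12.71}
$$

In particular, for a two-body system,

$$
H=\frac{P_1^2}{2m_1}\left(1-\frac{P_1^2}{4m_1^2c^2}\right)
+\frac{P_2^2}{2m_2}\left(1-\frac{P_2^2}{4m_2^2c^2}\right)
+\frac{1}{4\pi\epsilon_0}\frac{q_1q_2}{r}
\left[1-\frac{\mathbf{P}_1\cdot\mathbf{P}_2
+(\hat{\mathbf{r}}\cdot\mathbf{P}_1)(\hat{\mathbf{r}}\cdot\mathbf{P}_2)}{2m_1m_2c^2}\right],
\tag{12.72}
$$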


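where $r=r_{12}$. An alternative derivation is obtained using eq. (8.87),
that, for a single particle in an external potential, we rewrite in the form

$$
H=mc^2\sqrt{1+\left(\frac{\mathbf{P}-q\mathbf{A}}{mc}\right)^2}+q\phi,
\tag{12.73}
$$

where the potentials $\phi$ and $\mathbf{A}$ must be computed on the position $\mathbf{x}(t)$ of
the particle. Expanding the square root to second order,⁹

$$
H=mc^2+\frac{(\mathbf{P}-q\mathbf{A})^2}{2m}-\frac{(\mathbf{P}-q\mathbf{A})^4}{8m^3c^2}
+\ldots+q\phi.
\tag{12.74}
$$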


The first term is the energy associated with the rest mass, which, in a
non-relativistic context, we subtract. To 1PN order, the Hamiltonian
of the a-th particle, in the fields $\phi_{a,\rm ext}$, $\mathbf{A}_{a,\rm ext}$ generated by the
other particles, is therefore

$$
H_a=\frac{P_a^2}{2m_a}-\frac{P_a^4}{8m_a^3c^2}+q_a\phi_{a,\rm ext}
-\frac{q_a}{m_a}\mathbf{P}_a\cdot\mathbf{A}_{a,\rm ext},
\tag{12.75}
$$

where we subtracted the rest-mass contribution, and we took into account
the fact that the vector potential $\mathbf{A}_{a,\rm ext}$ generated by the
non-relativistic charges $q_b$ (with $b\neq a$) is proportional to $1/c^2$, see
eq. (12.37), and therefore, to 1PN order, it contributes only through the
cross product with $\mathbf{P}_a$ in the expansion of the quadratic term. To this
order, the total Hamiltonian is therefore

$$
H=\sum_{a=1}^{N}\left[\frac{P_a^2}{2m_a}-\frac{P_a^4}{8m_a^3c^2}
+\frac{1}{2}q_a\phi_{a,\rm ext}
-\frac{q_a}{2m_a}\mathbf{P}_a\cdot\mathbf{A}_{a,\rm ext}\right],
\tag{12.76}
$$

where, as before, the factors of 1/2 compensate the double counting
of the interaction term: otherwise, e.g., the term in the Hamiltonian
proportional to $q_1q_2/r_{12}$ would be counted once when we compute the
energy of particle 1 in the potential generated by particle 2, and once
when we compute the energy of particle 2 in the potential generated
by particle 1. In this expression, the potentials $\phi_{a,\rm ext}$ and
$\mathbf{A}_{a,\rm ext}$ must be evaluated on the position of the a-th particle, i.e.,
they are given, more precisely, by $\phi_{a,\rm ext}[t,\mathbf{x}_a(t)]$ and
$\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)]$. Reading them from eqs. (12.31) and (12.37),
with $\mathbf{x}$ replaced by $\mathbf{x}_a(t)$, and using, at this order,
$\mathbf{v}_a=\mathbf{P}_a/m_a$, $\mathbf{v}_b=\mathbf{P}_b/m_b$, we get back eq. (12.71).

9 With the obvious notation $(\mathbf{P}-q\mathbf{A})^4=[(\mathbf{P}-q\mathbf{A})^2]^2$.
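As a numerical sanity check of the Legendre transform, the following sketch evaluates, for two particles, the energy function $\sum_a\mathbf{P}_a\cdot\mathbf{v}_a-L$ built from eqs. (12.58), (12.59) and (12.66), and compares it with the Hamiltonian (12.72) evaluated on the same momenta. The two should differ only by terms beyond 1PN order. The masses, the geometry, the shorthand `k` for $q_1q_2/(4\pi\epsilon_0)$, and the scaling of `k` with `eps**2` (so that the Coulomb energy stays comparable to the kinetic energy as `eps` plays the role of $v/c$) are all assumptions of this illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def darwin_energy_residual(eps, m1=1.0, m2=2.0, k0=1.0, c=1.0):
    """Return E(v) - H(P(v)) for a two-body test configuration.

    eps sets the size of v/c; the coupling k = q1*q2/(4*pi*eps0) is scaled
    as eps**2. All numerical values are arbitrary choices for this sketch.
    """
    k = k0 * eps**2
    r, n = 1.0, [1.0, 0.0, 0.0]          # separation and unit vector
    v1 = [0.3 * eps, 0.4 * eps, 0.0]
    v2 = [-0.2 * eps, 0.1 * eps, 0.0]
    v1sq, v2sq = dot(v1, v1), dot(v2, v2)

    # Lagrangian, eqs. (12.58)-(12.59)
    darwin = k / (2 * c**2 * r) * (dot(v1, v2) + dot(n, v1) * dot(n, v2))
    lagr = (0.5 * m1 * v1sq + 0.5 * m2 * v2sq - k / r
            + (m1 * v1sq**2 + m2 * v2sq**2) / (8 * c**2) + darwin)

    # Conjugate momenta, eq. (12.66)
    def momentum(m, v, vsq, v_other):
        nv = dot(n, v_other)
        return [(1 + vsq / (2 * c**2)) * m * v[i]
                + k / (2 * c**2 * r) * (v_other[i] + n[i] * nv)
                for i in range(3)]

    p1 = momentum(m1, v1, v1sq, v2)
    p2 = momentum(m2, v2, v2sq, v1)

    # Energy function from the Legendre transform, eq. (12.68)
    energy = dot(p1, v1) + dot(p2, v2) - lagr

    # Hamiltonian in terms of the momenta, eq. (12.72)
    p1sq, p2sq = dot(p1, p1), dot(p2, p2)
    ham = (p1sq / (2 * m1) * (1 - p1sq / (4 * m1**2 * c**2))
           + p2sq / (2 * m2) * (1 - p2sq / (4 * m2**2 * c**2))
           + k / r * (1 - (dot(p1, p2) + dot(n, p1) * dot(n, p2))
                      / (2 * m1 * m2 * c**2)))
    return energy - ham

res_small = darwin_energy_residual(0.01)
res_large = darwin_energy_residual(0.02)
ratio = res_large / res_small
print(res_small, res_large, ratio)
```

With this scaling the Newtonian terms are of order `eps**2` and the 1PN terms of order `eps**4`, so a residual that grows roughly like `eps**6` when `eps` is doubled is what one expects if eq. (12.72) is correct to 1PN order.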


12.2.3 Reduction of order of the equations of motion


As we discussed after eq. (12.28), in general, in the PN expansion, we 
find terms with derivatives higher than the second in the Lagrangian. 
In the Lorenz gauge, we see from eqs. (12.1)–(12.4) that, at nPN order,
$\phi$ contains time derivatives of the positions of the particles up to
$\partial_t^{2n}$, while $\mathbf{A}$ starts to contribute from 1PN and, at nPN order,
contains time derivatives up to $\partial_t^{2n-2}$. At the 1PN level, we were able
to eliminate the higher-order derivatives with a gauge transformation to the
Coulomb gauge. However, this is no longer possible at higher orders. To
eliminate a term proportional to $\partial_t^{2n}$ in $\phi$ with a gauge transformation
$\phi\to\phi+(1/c)\partial_t\theta$, we need terms proportional to $\partial_t^{2n-1}$ in
$\theta$. Then, the same gauge transformation applied to the vector potential,
$\mathbf{A}\to\mathbf{A}-\nabla\theta$, induces terms proportional to $\partial_t^{2n-1}$ in
$\mathbf{A}$ (that add to the derivatives up to $\partial_t^{2n-2}$ that were already
present in $\mathbf{A}$ before the gauge transformation), so, even if we can
eliminate altogether higher-order derivatives from $\phi$ by choosing the
Coulomb gauge, beyond 1PN order second- and higher-order derivatives of
$\mathbf{x}_a(t)$ remain in $\mathbf{A}$, and therefore in the Lagrangian.
Correspondingly, the equations of motion will be of order
higher than second, involving third time derivatives of the position, as
well as higher and higher time derivatives as we increase the PN order.
At first one might think that this is very problematic, since the evo- 
lution would no longer be determined by giving, as initial conditions, 
the initial positions and velocities of the particles; even worse, at each 
new order of the PN expansion we would need more and more initial 
conditions, involving higher and higher time derivatives of the position. 
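Furthermore, higher-order differential equations are typically plagued by
all kinds of instabilities. Actually, this catastrophe is only apparent, and
the equations of motion can be systematically reduced to second-order
equations, at each order of the PN expansion. As a schematic example
of this “reduction-of-order” procedure, consider an equation of the form

$$
\ddot{q}=f_0(q,\dot{q})+\frac{1}{c^2}f_1(q,\dot{q},\dddot{q})+O\!\left(\frac{1}{c^4}\right),
\tag{12.77}
$$

for some dynamical variable $q(t)$. Taking a time derivative of this
equation gives

$$
\dddot{q}=\frac{d}{dt}f_0(q,\dot{q})+O\!\left(\frac{1}{c^2}\right).
\tag{12.78}
$$

We can then use this to replace $\dddot{q}$ on the right-hand side of eq. (12.77),
which gives

$$
\ddot{q}=f_0(q,\dot{q})+\frac{1}{c^2}f_1\!\left(q,\dot{q},\frac{d}{dt}f_0(q,\dot{q})
+O\!\left(\frac{1}{c^2}\right)\right)+O\!\left(\frac{1}{c^4}\right).
\tag{12.79}
$$

However, to order $1/c^2$, this is the same as

$$
\ddot{q}=f_0(q,\dot{q})+\frac{1}{c^2}f_1\!\left(q,\dot{q},\frac{d}{dt}f_0(q,\dot{q})\right)
+O\!\left(\frac{1}{c^4}\right),
\tag{12.80}
$$

in which, on the right-hand side, only derivatives up to $\ddot{q}$ enter. So, in
practice, to order $1/c^2$, we can simply use the zeroth-order equation of
motion $\ddot{q}=f_0(q,\dot{q})$ to compute the term $\dddot{q}$ in eq. (12.77), since the
latter is already multiplied by $1/c^2$. Observe that $\ddot{q}$ also appears on the
right-hand side of eq. (12.80), since

$$
\frac{d}{dt}f_0(q,\dot{q})=\frac{\partial f_0}{\partial q}\dot{q}
+\frac{\partial f_0}{\partial\dot{q}}\ddot{q}.
\tag{12.81}
$$

Then, solving for $\ddot{q}$, the equation can be rearranged in the form

$$
\ddot{q}=f_0(q,\dot{q})+\frac{1}{c^2}g_1(q,\dot{q})+O\!\left(\frac{1}{c^4}\right),
\tag{12.82}
$$

with a different function $g_1(q,\dot{q})$, given explicitly by

$$
g_1(q,\dot{q})=f_1\!\left(q,\dot{q},\,\dot{q}\,\frac{\partial f_0}{\partial q}
+f_0\,\frac{\partial f_0}{\partial\dot{q}}\right).
\tag{12.83}
$$

This kind of procedure can be carried out order by order, keeping the
equations of motion always of second order.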
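The reduction of order can be tried out on a toy model. The sketch below builds the second-order right-hand side of eq. (12.82) from user-supplied functions and integrates it from initial position and velocity only. The specific choice $f_0=-q$ and $f_1=\dddot{q}$, the small parameter `eps` standing in for $1/c^2$, and the Runge-Kutta integrator are assumptions made for this illustration, not part of the text above. For this choice the reduced equation is $\ddot{q}=-q-\epsilon\,\dot{q}$, a damped oscillator with a known closed-form solution.

```python
import math

def reduced_rhs(f0, df0_dq, df0_dqdot, f1, inv_c2):
    """Build the second-order right-hand side of eq. (12.82).

    f0(q, qd)         : zeroth-order acceleration
    df0_dq, df0_dqdot : its partial derivatives (supplied by hand here)
    f1(q, qd, qddd)   : 1/c^2 correction, depending on the third derivative
    """
    def rhs(q, qd):
        # eq. (12.81) with the acceleration replaced by f0, as in eq. (12.83)
        qddd = qd * df0_dq(q, qd) + f0(q, qd) * df0_dqdot(q, qd)
        return f0(q, qd) + inv_c2 * f1(q, qd, qddd)
    return rhs

def rk4(rhs, q, qd, t_end, dt):
    """Classical fourth-order Runge-Kutta for q'' = rhs(q, q')."""
    steps = round(t_end / dt)
    for _ in range(steps):
        k1q, k1v = qd, rhs(q, qd)
        k2q, k2v = qd + 0.5 * dt * k1v, rhs(q + 0.5 * dt * k1q, qd + 0.5 * dt * k1v)
        k3q, k3v = qd + 0.5 * dt * k2v, rhs(q + 0.5 * dt * k2q, qd + 0.5 * dt * k2v)
        k4q, k4v = qd + dt * k3v, rhs(q + dt * k3q, qd + dt * k3v)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        qd += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q, qd

eps = 0.01  # plays the role of 1/c^2 in this toy model
rhs = reduced_rhs(f0=lambda q, qd: -q,
                  df0_dq=lambda q, qd: -1.0,
                  df0_dqdot=lambda q, qd: 0.0,
                  f1=lambda q, qd, qddd: qddd,
                  inv_c2=eps)
q_num, _ = rk4(rhs, q=1.0, qd=0.0, t_end=10.0, dt=0.01)

# Closed-form solution of q'' + eps*q' + q = 0 with q(0) = 1, q'(0) = 0
omega = math.sqrt(1 - eps**2 / 4)
q_exact = math.exp(-eps * 10.0 / 2) * (math.cos(omega * 10.0)
                                       + eps / (2 * omega) * math.sin(omega * 10.0))
print(q_num, q_exact)
```

Only two initial conditions are needed, even though the unreduced equation contains a third derivative.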

As a related application of this reduction-of-order procedure, observe,
from eq. (12.48), that $\mathbf{A}_{a,\rm ext}[t,\mathbf{x}_a(t)]$ depends on $\mathbf{v}_b(t)$, for all
$b\neq a$. Then, on the left-hand side of eq. (12.49) the accelerations
$d\mathbf{v}_b/dt$ of all particles also appear, so it might look that, even if the
equations are just of second order, the equation of motion of the a-th
particle still involves the acceleration of all other particles, giving rise
to a very complicated coupled set of equations. However, since, in
eq. (12.49), $d\mathbf{v}_b/dt$ (with $b\neq a$) only appears in a term of order $1/c^2$,
it can be replaced by its expression computed using the equations of motion
for the b-th particle at Newtonian order, i.e., as a function of positions
only. So, in the end, the acceleration of the a-th particle depends on the
positions and velocities of all other particles, but not on their
accelerations.


12.3 Self-force and radiation reaction

This section is very advanced and should definitely be skipped at first
reading.


In the previous section, when computing the conservative dynamics of a
system of point particles up to 1PN order, we excluded “self-force”
terms, claiming that, for a system of particles, eq. (12.13) must be
replaced by eq. (12.16), in which the charge $q_a$ feels the effect of the
fields generated by the charges $q_b$ with $b\neq a$, but is not subject to the
force produced by its own field. While this might sound eminently reasonable,
it must be acknowledged that this is an “ad hoc” prescription, that we
have superimposed on the formalism. In reality, Maxwell’s equations
instruct us to compute the electric and magnetic fields generated by the
total charge and current densities, and then these electric and magnetic
fields act on the charges according to the Lorentz force [in its relativistic
form, (8.62), or (8.65)]. In Section 12.3.2 we will perform the computation
of the 1PN dynamics including these self-force terms and we will
show that, up to 1PN, the various self-force contributions either vanish
or can be reabsorbed into a redefinition (more technically, a
“renormalization”) of the mass of the particle. On the one hand, this will
confirm the results of the previous section, putting them on a firmer
conceptual ground. On the other hand, these results will pave the way for
the study of radiation reaction at 1.5PN order in Section 12.3.3, where we
will see that the inclusion of self-force terms is necessary to get the
correct result. If one simply discarded them by hand, one would not get the
energy loss to radiation corresponding to the Larmor formula. These
self-energy terms are therefore absolutely real and, beyond the 1PN level
(where, a posteriori, the naive approach of throwing them away turns out to
give the correct answer) they must be included. In Section 12.3.4 we will see
how to obtain radiation reaction to all orders from a covariantization of
the 1.5PN result and finally, in Section 12.3.5, we will show how mass
renormalization and radiation reaction can be obtained from a single,
fully covariant, computation, that completely gets rid of any notion of
extended electron.

First, however, in Section 12.3.1 we compare two different general 
frameworks that can be used to address these problems, either based on 
an extended classical model of the electron, or using regularization and 
renormalization techniques borrowed from quantum field theory. 
12.3.1 Classical extended electron models vs. regularization schemes


The basic problem that we will have to face, when dealing with self- 
forces, is related to the assumption of exactly point-like charges. As 
we already saw in Section 5.2.2, this leads to a divergent expression 
for the self-energy of the particle. Historically, the problem was
first tackled by trying to build a classical model of an extended electron.
However, no consistent and convincing classical electron model has ever
emerged, despite significant effort, starting already from the works of
Abraham, Lorentz, and Poincaré at the beginning of the 20th century. In
the Abraham–Lorentz–Poincaré model, the electron was modeled as a
uniformly charged spherical shell. However, hypothetical mechanical
forces must be introduced to stabilize the electron against the
electrostatic repulsion among its parts. Even when this is done, and the
forces are chosen so as to render stable a configuration with spherical
symmetry, the model still turns out to be unstable under non-spherical
perturbations.¹⁰ Furthermore, and most importantly, these hypothetical
stabilizing mechanical forces do not correspond to anything in Nature. 
We now understand that a consistent theory of the structure of elemen- 
tary particles can only be obtained in the framework of quantum field 
theory. 

Here we will then take a different approach to the problem, inspired 
indeed by quantum field theory, in which these divergences are dealt 
with by using renormalization theory.¹¹ In this approach, one starts by
regularizing the theory, which means that we introduce a length-scale $\ell$
that smooths out the divergences so that, to begin with, we deal with
well-defined mathematical expressions. For instance, in Section 5.2.2,
we regularized the self-energy of a charge distribution by removing the
Fourier modes with wavenumber $|\mathbf{k}|>\pi/\ell$, see eqs. (5.27)–(5.31).
Regularization must be understood just as a mathematical step. It is not
meant to correspond, in any way, to a physical model of an extended
classical electron, and physical results are only obtained in the limit
$\ell\to 0$. This eliminates any concern about the need for hypothetical
stabilizing mechanical forces, or about the stability of the extended electron
model under perturbations. The issue, now, rather becomes how to take
the limit $\ell\to 0$ so as to recover finite results. This is obtained by
realizing that, once we need to introduce a cutoff $\ell$ in the theory, the
various parameters that we have introduced at the level of the action (or, in
our classical context, of the equations of motion), such as the mass and
charge of the particle, are not yet the observable quantities, but rather
just parameters that enter in the intermediate steps of the computation
(“bare parameters”, in the field theory jargon), that can a priori also
depend on this cutoff. For instance, the action (7.133) must be replaced
by


$$
S_{\rm bare}=-m_0(\ell)c^2\int d\tau+\ldots,
\tag{12.84}
$$

where we admit that the parameter previously denoted by $m$ and
interpreted as the mass could a priori also depend on $\ell$, and is not yet
the physical, observed, mass. To stress this change of perspective, we
denote it by $m_0$ instead of $m$, and we call it the “bare” mass. The dots
in eq. (12.84) indicate that we also admit the possibility of adding extra
terms with a different structure, that we will discuss below. The crucial
point is that $m_0(\ell)$ is not observable. A charged particle, with charge
$q_a$, always comes with its own electric field and, as we saw in eq. (5.31),
this produces a self-energy term $[1/(4\pi\epsilon_0)](q_a^2/\ell)$. Therefore, the
total rest-energy of a particle, labeled by an index $a$, is

$$
E_{{\rm rest},a}=m_{0,a}(\ell)c^2+\frac{1}{4\pi\epsilon_0}\frac{q_a^2}{\ell}.
\tag{12.85}
$$

This means that the actual observable mass of the particle, that we call
the “renormalized” mass, is given by

$$
m_a=m_{0,a}(\ell)+\frac{1}{4\pi\epsilon_0c^2}\frac{q_a^2}{\ell}.
\tag{12.86}
$$

The logic of renormalization is that the bare mass $m_{0,a}(\ell)$ must be seen
as a quantity which is completely in our hands when we define the theory,
and which is chosen so that it cancels the divergence of the self-energy
term for $\ell\to 0$, leaving us with a finite result for the renormalized mass,
equal to the observed value. By themselves, the two separate terms on
the right-hand side of eq. (12.86) have no physical meaning. They both
separately diverge in the limit $\ell\to 0$, just in such a way that their
divergences cancel in the sum, and their value before removing the cutoff
depends on the regularization scheme that we have chosen. Only their sum is
physical, and finite.¹²

10 See Pearle (1982) for a review of these approaches, and Damour (2017)
for a historical discussion of the contribution of Poincaré to the extended
electron model.

11 It is interesting to observe that this approach was already advocated by
Dirac (1938), when the understanding of the divergences in quantum field
theory was still quite limited. From Dirac’s 1938 paper: “We shall retain
Maxwell’s theory to describe the field right up to the point-singularity which
represents our electron and shall try to get over the difficulties associated
with the infinite energy by a process of direct omission or subtraction of
unwanted terms, somewhat similar to what has been used in the theory of the
positron. Our aim will be not so much to get a model of the electron as to get
a simple scheme of equations that can be used to calculate all the results
that can be obtained from experiment.”

12 Notice, in particular, that $m_{0,a}(\ell)=m_a-[1/(4\pi\epsilon_0c^2)]q_a^2/\ell$
goes to minus infinity as $\ell\to 0^+$ so, for $\ell$ sufficiently small, it is
negative. Most of the confusion, in some literature, comes from interpreting
$m_{0,a}(\ell)$ as a “mechanical mass”, and the term proportional to $q_a^2/\ell$
as an “electromagnetic mass.” This nomenclature is already misleading, since
it implicitly suggests that these two quantities have, separately, an
intrinsic physical meaning (typically leading these texts to statements such
as that a negative “mechanical” mass is unacceptable). In the logic of
renormalization, which is the standard tool of quantum field theory, neither
of them has any separate physical meaning. Their value, and even their sign,
depends on the regularization scheme used, and only their sum is physically
meaningful.
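To get a feeling for the numbers in eq. (12.86), the sketch below evaluates the bare mass of an electron for a few values of the cutoff $\ell$, keeping the renormalized mass fixed at its observed value. The SI constants are hard-coded and the helper names are hypothetical. The coefficient of $q^2/\ell$ is the one quoted in the text for the wavenumber cutoff of Section 5.2.2, and a different regularization would in general change it.

```python
import math

# SI values, hard-coded for this illustration
E_CHARGE = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12         # vacuum permittivity, F/m
C_LIGHT = 299792458.0           # speed of light, m/s
M_ELECTRON = 9.1093837015e-31   # observed (renormalized) electron mass, kg

def self_energy_mass(q, ell):
    """Second term on the right-hand side of eq. (12.86)."""
    return q**2 / (4 * math.pi * EPS0 * C_LIGHT**2 * ell)

def bare_mass(m_ren, q, ell):
    """Bare mass m_0(ell) that reproduces the renormalized mass m_ren."""
    return m_ren - self_energy_mass(q, ell)

# Cutoff at which the bare mass changes sign
ell_zero = E_CHARGE**2 / (4 * math.pi * EPS0 * M_ELECTRON * C_LIGHT**2)

for ell in (1e-12, 1e-14, ell_zero, 1e-16, 1e-18):
    m0 = bare_mass(M_ELECTRON, E_CHARGE, ell)
    m_back = m0 + self_energy_mass(E_CHARGE, ell)   # always the observed mass
    print(f"ell = {ell:.3e} m   m0 = {m0:+.3e} kg   m0 + self-energy = {m_back:.6e} kg")
```

The length `ell_zero` coincides with the quantity usually called the classical electron radius. Below it the bare mass is negative, while the sum of the two terms stays equal to the observed mass by construction, in line with the discussion above.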

It should be stressed that this point of view, based on renormalization, 
is not just an optional possibility, when dealing with the divergences that 
appear in the point particle limit. At the scale of elementary particles, 
eventually quantum mechanics and quantum field theory enter the game. 
At that level, divergences such as that discussed above are unavoidable, 
and can only be cured through renormalization. The mass renormaliza- 
tion discussed above will then automatically take place, together with 
the renormalization of other parameters (such as the electron charge). 
From this perspective, the attempts at constructing a classical extended
model of the electron, where the self-energy term is interpreted as an
actual finite physical contribution to the total mass, look futile. At the
quantum level, the self-energy term will anyhow be divergent, and this 
divergence can only be cured through renormalization. 

This quantum field theory approach also allows us to clarify another 
aspect that has plagued the attempts at constructing classical extended 
models of the electron, which is related to Lorentz invariance. A charged 
shell which is spherical in its rest frame would be deformed, by the 
Lorentz contraction, from the point of view of a boosted observer. There- 
fore a rigid extended model, such as that based on a rigid charged shell 
initially devised by Abraham in 1903-1904, is not Lorentz invariant. As
a result, the self-energy contribution to the energy and the corresponding
self-contribution to the momentum (obtained from the momentum
associated with the electromagnetic field of the particle, in a frame where
it has a velocity $\mathbf{v}$) do not form a four-vector. For a rigid charged shell
of radius $b$, whose center-of-mass moves with four-velocity
$u^\mu=(\gamma c,\gamma\mathbf{v})$, the self-energy and self-momentum due to the
electromagnetic field of the particle are given by¹³


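$$
p^\mu_{\rm em}=\frac{1}{4\pi\epsilon_0}\frac{q^2}{2bc^2}u^\mu
+\frac{1}{4\pi\epsilon_0}\frac{1}{3}\frac{q^2}{2bc^2}
\frac{1}{\sqrt{1-v^2/c^2}}\left(\frac{v^2}{c},\mathbf{v}\right).
\tag{12.87}
$$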


Note that the four-vector notation $p^\mu_{\rm em}$ here is an abuse of notation,
since, on the right-hand side, the term proportional to $u^\mu$ is a
four-vector, but the second term, written giving explicitly its temporal and
spatial components, is not. Poincaré, in 1905-1906, added suitable mechanical
stresses, which produce a “mechanical energy and momentum.” These too,
separately, do not form a four-vector, but can be chosen so as to
cancel exactly the second term in eq. (12.87). So, adding this to the
mechanical four-momentum $m_0u^\mu$ of the center of mass, in the Poincaré
model the extended electron has a total mechanical four-momentum

$$
p^\mu_{\rm mech}=m_0u^\mu
-\frac{1}{4\pi\epsilon_0}\frac{1}{3}\frac{q^2}{2bc^2}
\frac{1}{\sqrt{1-v^2/c^2}}\left(\frac{v^2}{c},\mathbf{v}\right).
\tag{12.88}
$$

The sum of the two terms, $p^\mu=p^\mu_{\rm em}+p^\mu_{\rm mech}$, is a four-vector, that
is interpreted as the total four-momentum of the extended electron,

$$
p^\mu=\left(m_0+\frac{1}{4\pi\epsilon_0}\frac{q^2}{2bc^2}\right)u^\mu,
\tag{12.89}
$$


so that Lorentz invariance is recovered. Apart from the fact that this 
model turns out to be unstable under non-spherical perturbations, and 
therefore eventually is not viable even classically, within our modern 
perspective it is clear that the whole construction is very artificial and 
has nothing to do with the actual description of an elementary particle 
in quantum field theory. 

However, what Poincaré did can be reinterpreted, in the framework of 
renormalization in quantum field theory, as follows. When one regular- 
izes a theory, the symmetries of the original action may or may not be 
respected by the regularization process. For instance, if we regularize the 
Dirac deltas in eqs. (12.17) and (12.18) using an extended rigid model 
of the electron, our regularization breaks Lorentz invariance since, as we 
have discussed, a rigid spherical electron is not consistent with Lorentz 
invariance. The same happens if we regularize the action of a point 
particle by imposing a cutoff on the Fourier modes, restricting to modes 
with $|\mathbf{k}|<\pi/\ell$. Again this is not a Lorentz-invariant condition, since the
value of the wavenumber k, and of its modulus, changes under a Lorentz 
boost, so the above condition can only be valid in a specific frame. In 
general, there is nothing wrong with using a regularization that breaks 
one of the symmetries of the theory, and this is commonly done in quantum
field theory computations. Simply, in this case one must admit that
the bare quantities (that, we should remember, are not physical observables,
but just mathematical entities that we choose at our will) do not
have to respect that symmetry either, and will rather be adjusted so as
to recover that symmetry at the level of renormalized quantities. So,
for instance, if we start from the point-particle action (12.84), that in
this context is called the “bare” action, and we use a regularization that
breaks Lorentz invariance, we must admit the presence of other terms
[“counterterms,” in the quantum field theory jargon, indicated generically
by the dots in eq. (12.84)], that do not need to respect Lorentz
invariance, and that will be adjusted so as to recover Lorentz invariance
at the level of the renormalized theory. So, what Poincaré actually did,
from this perspective, is equivalent to starting from the bare action that
corresponds to the Abraham model of the electron, interpreted now just
as a form of regularization of a point charge (that breaks Lorentz
invariance), and adding to it a counterterm which is also not Lorentz
invariant, and is adjusted so as to obtain Lorentz invariance for the
renormalized quantities.

13 See eqs. (7.16) and (11.29) of Pearle (1982).

14 In quantum field theory, a cutoff over wavenumbers, $|\mathbf{k}|<\Lambda$ (or over
momenta, where, at the quantum level, the momentum $\mathbf{p}$ is related to the
wavenumber $\mathbf{k}$ by $\mathbf{p}=\hbar\mathbf{k}$), is typically used only for qualitative
discussions, precisely because it breaks Lorentz invariance, and for actual
computations in general one prefers to use other schemes, such as
Pauli–Villars or dimensional regularization, to avoid the need of dealing
with counterterms that do not respect Lorentz invariance. However, there are
situations where it is necessary to use a regularization that breaks Lorentz
invariance. In particular, for non-perturbative computations in quantum
chromodynamics (the fundamental theory of strong interactions), the best
regularization consists in putting the theory on a space-time lattice
(furthermore, rotating from Minkowski to Euclidean space). In this case the
full (Euclidean) Lorentz invariance is broken to a subgroup of discrete
rotations and, to recover Lorentz invariance in the continuum limit, one
must introduce counterterms that are not Lorentz invariant, and only respect
this smaller symmetry group. There are also more complex situations, where
a symmetry broken by the regularization is not recovered when the cutoff is
removed. This gives rise to quantum field theory anomalies, but these will
not concern us here.

In Section 12.3.2 we will perform a similar but conceptually more 
transparent computation, as follows. We will start from the charge and 
current density of a point-particle, and we will regularize them by
imposing a cutoff $|\mathbf{k}|<\pi/\ell$ over the wavenumbers of their Fourier modes.
This will be the equivalent of an extended classical electron model, since
it amounts to smoothing out, over a distance of order $\ell$, the Dirac deltas
in eqs. (12.17) and (12.18), except that we make it clear that this is just a
regularization, and the limit $\ell\to 0$ must be taken in the end, so it should
not be interpreted as an actual model of a classical extended electron.
As we already mentioned, this regularization breaks Lorentz invariance.
We will then compute explicitly the corresponding divergences in the
self-energy and in the self-momentum (which would provide the result
analogous to eqs. (12.88) and (12.89) with our regularization of the point 
particle, rather than with the Abraham—Lorentz—Poincaré model) and 
we will then show how to renormalize the theory with a simple non- 
Lorentz-invariant counterterm, so as to recover Lorentz invariance for 
the renormalized quantities. 

The use of a renormalization scheme that involves counterterms that 
are not Lorentz invariant might be unfamiliar even to many advanced 
readers.¹⁴ However, there is no need to break Lorentz invariance with the
regularization. After working out mass renormalization with the above 
regularization, that breaks Lorentz invariance (and that corresponds to 
the naive idea of an extended classical electron model), in Section 12.3.5 
we will show how to regularize and renormalize the divergences associ- 
ated with point particles in a fully Lorentz-invariant manner, recovering 
mass renormalization in a way that maintains Lorentz symmetry mani- 
fest at each stage. The latter procedure will be completely in line with 
standard quantum field theory computations, and is in fact the best 
starting point for including also quantum effects. 
12.3.2 Self-energy and mass renormalization


In this subsection we extend the discussion of Section 12.2, including 
now also the self-forces, for a collection of charges modeled as extended 
charge distributions. As discussed above, in the case of elementary par- 
ticles this must be considered merely as a regularization, and we are 
only interested in taking eventually the point-particle limit, eliminating 
the divergences through the renormalization procedure. Some aspects 
of the formalism, however, can also be useful for actual macroscopic 
bodies, where the extended charge distribution is really the physical dis- 
tribution of the body, rather than a mathematical trick for regularizing 
a point particle.¹⁵ So, in the following each particle will be described
by a generic charge density $\rho_a(t,\mathbf{x})$, localized in a volume $V_a(t)$ (whose
position changes in time because the position of the particle changes),
that we take small compared to the overall volume in which the system of
charges is localized. We assume that the volumes $V_a$ corresponding
to the different charges are non-overlapping during the whole time span
for which we follow the time evolution, i.e., that the particles do not
merge (nor disintegrate). This assumption was already implicit in the
computation of Section 12.2, see Note 4 on page 301. The position of the
a-th charged body is then defined by its “center-of-charge” coordinate

$$
\mathbf{x}_a(t)=\frac{1}{q_a}\int d^3x\,\rho_a(t,\mathbf{x})\,\mathbf{x}.
\tag{12.90}
$$


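Note that the integration is actually only over the volume $V_a(t)$ where
the particle is localized, but we can extend it to all of space, because
anyhow $\rho_a(t,\mathbf{x})$ vanishes outside $V_a(t)$. This has the advantage that the
time dependence of $\mathbf{x}_a(t)$ enters only through $\rho_a(t,\mathbf{x})$. The velocity and
acceleration of the particles are given by $\mathbf{v}_a(t)=d\mathbf{x}_a/dt$ and
$\mathbf{a}_a(t)=d^2\mathbf{x}_a/dt^2$. Using the continuity equation (3.30),

$$
\begin{aligned}
\frac{dx_a^i}{dt}&=\frac{1}{q_a}\int d^3x\,\partial_t\rho_a(t,\mathbf{x})\,x^i\\
&=-\frac{1}{q_a}\int d^3x\,\left[\partial_kj_a^k(t,\mathbf{x})\right]x^i\\
&=+\frac{1}{q_a}\int d^3x\,j_a^k(t,\mathbf{x})\,\partial_kx^i\\
&=+\frac{1}{q_a}\int d^3x\,j_a^i(t,\mathbf{x}),
\end{aligned}
\tag{12.91}
$$

where we integrated by parts (neglecting the boundary term since $\rho_a$ is
localized) and we used $\partial_kx^i=\delta^i_k$. Therefore

$$
\mathbf{v}_a(t)=\frac{1}{q_a}\int d^3x\,\mathbf{j}_a(t,\mathbf{x}).
\tag{12.92}
$$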


a 


In the point-like limit, using eqs. (12.17) and (12.18), eqs. (12.90) and 
(12.92) correctly reduce to the position and velocity of a point charge. 
The simplest model of a current distribution consistent with eq. (12.92)
is given by

$$
\mathbf{j}_a(t,\mathbf{x})=\rho_a(t,\mathbf{x})\,\mathbf{v}_a(t).
\tag{12.93}
$$

This corresponds to a charge distribution that moves with a velocity
$\mathbf{v}_a(t)$, without superimposed internal motions. Choosing a different
model would simply complicate the explicit computation, adding
terms that vanish when, at the end of the computation, we eventually
take the point-particle limit. We will therefore assume the form (12.93)
of the current density, for our extended model of the current and charge
distribution.

15 A very similar formalism is useful in Newtonian gravity (with the charge
density replaced by the mass density) to describe self-gravitating objects,
see Damour (1987) for pioneering work and Poisson and Will (2014) for a
recent textbook discussion.

16 We will see below how to obtain these different bare masses for the energy
and momentum from the addition of a non-Lorentz-invariant counterterm in
the action.
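The definitions (12.90), (12.92) and (12.93) can be illustrated with a one-dimensional toy model. In the sketch below a Gaussian charge density moves rigidly along a prescribed trajectory, and the center of charge and the velocity are obtained from numerical integrals of the charge and current densities. The Gaussian profile, the trajectory, the restriction to one dimension, and the midpoint-rule integrator are all assumptions made for this illustration.

```python
import math

Q, SIGMA = 1.0, 0.1   # total charge and width of the smeared charge (arbitrary)

def trajectory(t):
    """Prescribed position x_a(t) of the centre of the distribution."""
    return 0.3 * math.sin(t)

def velocity(t):
    """Time derivative of trajectory(t)."""
    return 0.3 * math.cos(t)

def rho(t, x):
    """Rigidly moving Gaussian charge density, normalized to total charge Q."""
    return (Q * math.exp(-(x - trajectory(t))**2 / (2 * SIGMA**2))
            / (SIGMA * math.sqrt(2 * math.pi)))

def current(t, x):
    """Current density of eq. (12.93): j = rho * v_a(t)."""
    return rho(t, x) * velocity(t)

def integrate(f, lo=-3.0, hi=3.0, n=6000):
    """Midpoint rule; the integrand is negligible outside [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def center_of_charge(t):
    """One-dimensional version of eq. (12.90)."""
    return integrate(lambda x: rho(t, x) * x) / Q

def velocity_from_current(t):
    """One-dimensional version of eq. (12.92)."""
    return integrate(lambda x: current(t, x)) / Q

t0, h = 0.7, 1e-4
x_center = center_of_charge(t0)
v_finite_diff = (center_of_charge(t0 + h) - center_of_charge(t0 - h)) / (2 * h)
v_current = velocity_from_current(t0)
print(x_center, trajectory(t0), v_finite_diff, v_current)
```

The time derivative of the center of charge, computed here by finite differences, should agree with the integral of the current density, which is the content of eq. (12.91) in this simple setting.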

We regularize the theory by putting a cutoff $|\mathbf{k}|<\pi/\ell$, i.e., setting to
zero all Fourier modes $\rho(t,\mathbf{k})$ of the charge distribution with wavenumber
higher than $\pi/\ell$, as we already did in Section 5.2.2. In coordinate space
this corresponds to smoothing the Dirac delta, with most of its support
being concentrated up to a distance of order $\ell$ from its center. The
point-like limit is recovered as $\ell\to 0$. As we already mentioned, this is
not a Lorentz-invariant regularization, since the value of $|\mathbf{k}|$ changes
under a Lorentz boost, so the condition $|\mathbf{k}|<\pi/\ell$ is not Lorentz invariant.

With this regularization, the contribution of the self-field of the parti- 
cle to its rest energy was already computed in Section 5.2.2 and is given 
by the second term on the right-hand side of eq. (12.85). This self- 
energy contribution is reabsorbed into a mass renormalization, given 
in eq. (12.86). We now compute the contribution of the self-field of 
the particle to its spatial momentum, to understand how the full four- 
momentum renormalizes. To this purpose, we consider the equation of 
motion of a charge and current distribution written in the form of the 
Lorentz “force” equation (3.68). For a single extended object, we rewrite
it as

$$
\frac{d\mathbf{p}_a}{dt} = \int_V d^3x\, (\rho\,\mathbf{E} + \mathbf{j}\times\mathbf{B})\,. \tag{12.94}
$$

The crucial point is that, in the relation $\mathbf{p}_a = \gamma(v_a)\, m_a \mathbf{v}_a$, the mass must
again be taken as a bare parameter, that depends on the cutoff and that
will be adjusted so as to obtain the desired value for the renormalized
mass. Furthermore, since our regularization breaks Lorentz invariance,
this bare parameter is a priori different from the one that enters in the
energy, and that we denoted by $m_{0,a}(\ell)$ in eq. (12.86). We will then
denote it by $\tilde{m}_{0,a}(\ell)$.$^{16}$ Then, including also the self-force term that we
have omitted in our treatment in Section 12.2, the equation of motion
of the a-th particle becomes

$$
\frac{d}{dt}\left[\gamma(v_a)\,\tilde{m}_{0,a}(\ell)\,\mathbf{v}_a\right] = \mathbf{F}_{a,\rm ext}
+ \int d^3x\, \left[\rho_a(t,\mathbf{x})\,\mathbf{E}_a(t,\mathbf{x}) + \mathbf{j}_a(t,\mathbf{x})\times\mathbf{B}_a(t,\mathbf{x})\right]\,, \tag{12.95}
$$


where $\mathbf{F}_{a,\rm ext}$ is the contribution to the equation of motion of the a-th
particle from the other particles, while $\mathbf{E}_a$ and $\mathbf{B}_a$ are the electric and
magnetic fields generated by the a-th particle itself. In Section 12.2 we
studied this equation with only $\mathbf{F}_{a,\rm ext}$ on the right-hand side, and we
threw away by hand the self-force, i.e., the effect of $\mathbf{E}_a$ and $\mathbf{B}_a$ on the
a-th particle itself. The new aspect of this computation is that we now
take it into account explicitly, and we will see how, eventually, the result
of Section 12.2 can be justified.




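These self-fields are given by

$$
\mathbf{E}_a = -\nabla\phi_a - \partial_t\mathbf{A}_a\,, \tag{12.96}
$$
$$
\mathbf{B}_a = \nabla\times\mathbf{A}_a\,, \tag{12.97}
$$

where $\phi_a$ and $\mathbf{A}_a$ are, respectively, the scalar and vector potentials
generated by the particle a. From eqs. (12.1) and (12.2), in the Lorenz
gauge they are given by

$$
\phi_a(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int d^3x'\, \frac{\rho_a(t-|\mathbf{x}-\mathbf{x}'|/c,\,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,, \tag{12.98}
$$
$$
\mathbf{A}_a(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\frac{1}{c^2}\int d^3x'\, \frac{\mathbf{j}_a(t-|\mathbf{x}-\mathbf{x}'|/c,\,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{12.99}
$$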


We consider first the contribution to the self-force in eq. (12.95) coming 
from the electric field, up to 1PN order. Expanding eq. (12.98) up to 
1PN, and working in components, the contribution of the scalar potential 
to the right-hand side of eq. (12.95) is 


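$$
\begin{aligned}
\int d^3x\, \rho_a(t,\mathbf{x})\,[-\partial_i\phi_a(t,\mathbf{x})]
&= -\frac{1}{4\pi\epsilon_0}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,
\partial_i\left[\frac{\rho_a(t-|\mathbf{x}-\mathbf{x}'|/c,\,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\right] \\
&= -\frac{1}{4\pi\epsilon_0}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,
\partial_i\left[\frac{\rho_a(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}
- \frac{1}{c}\,\partial_t\rho_a(t,\mathbf{x}')
+ \frac{|\mathbf{x}-\mathbf{x}'|}{2c^2}\,\partial_t^2\rho_a(t,\mathbf{x}') + \ldots\right]\,.
\end{aligned}
\tag{12.100}
$$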


The first term of this expansion is the Newtonian self-force, since it is the
only term that survives in the limit $c \to \infty$ (even when we include
the contribution due to $\mathbf{A}$, since $\mathbf{A}$ starts from order $1/c^2$). However,
computing explicitly the derivative $\partial_i = \partial/\partial x^i$,

$$
\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,\partial_i\left[\frac{\rho_a(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\right]
= -\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,\rho_a(t,\mathbf{x}')\,\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|^3}\,,
\tag{12.101}
$$

and this expression vanishes, independently of the functional form of the
charge density $\rho_a(t,\mathbf{x})$, since the integrand is odd under the exchange
$\mathbf{x} \leftrightarrow \mathbf{x}'$ (while, since the integrals in $d^3x$ and $d^3x'$ can be extended to all
$\mathbb{R}^3$, the integration domain is invariant under such exchange). Therefore,
the Newtonian self-force vanishes, for any extended charge distribution.
Consider now the second term in the expansion in eq. (12.100). This
is formally a term of order 0.5PN, since it is proportional to $1/c$, but
again it vanishes, simply because $\partial_i$ is the derivative with respect to $\mathbf{x}$,
and it acts on $\partial_t\rho_a(t,\mathbf{x}')$, which depends on $\mathbf{x}'$ but not on $\mathbf{x}$. The third
term, which is a 1PN correction, requires a more involved computation.
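We manipulate it as follows:

$$
\begin{aligned}
&\left[\int d^3x\, \rho_a(t,\mathbf{x})\,[-\partial_i\phi_a(t,\mathbf{x})]\right]_{\rm 1PN} \\
&\quad= -\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,
\left(\partial_i|\mathbf{x}-\mathbf{x}'|\right)\partial_t^2\rho_a(t,\mathbf{x}') \\
&\quad= +\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,
\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|}\,\partial_t\left(\frac{\partial}{\partial x'_k}\, j_a^k(t,\mathbf{x}')\right) \\
&\quad= -\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,
\left(\frac{\partial}{\partial x'_k}\,\frac{x_i - x'_i}{|\mathbf{x}-\mathbf{x}'|}\right)\partial_t j_a^k(t,\mathbf{x}') \\
&\quad= \frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,\partial_t j_a^k(t,\mathbf{x}')
\left[\frac{\delta_{ik}}{|\mathbf{x}-\mathbf{x}'|} - \frac{(x_i - x'_i)(x_k - x'_k)}{|\mathbf{x}-\mathbf{x}'|^3}\right]\,,
\end{aligned}
\tag{12.102}
$$

where, to go from the second to the third line, we used the continuity
equation (3.30), and in the next line we integrated $\partial/\partial x'_k$ by parts.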

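The 1PN contribution to the electric field coming from $\mathbf{A}_a$ is obtained
neglecting retardation in $\mathbf{j}_a(t - |\mathbf{x}-\mathbf{x}'|/c,\,\mathbf{x}')$ in eq. (12.99), since $\mathbf{A}_a$ is
already proportional to $1/c^2$, so it is given by

$$
-\partial_t A_a^i(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\frac{1}{c^2}\int d^3x'\, \frac{\partial_t j_a^i(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,. \tag{12.103}
$$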


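Putting together the contribution from $-\nabla\phi_a$ and the contribution of
$-\partial_t\mathbf{A}_a$ to the electric field $\mathbf{E}_a$, we find that, at 1PN order,

$$
\left[\int d^3x\, \rho_a(t,\mathbf{x})\,E_a^i(t,\mathbf{x})\right]_{\rm 1PN}
= -\frac{1}{4\pi\epsilon_0}\frac{1}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,\partial_t j_a^k(t,\mathbf{x}')
\left[\frac{\delta_{ik}}{|\mathbf{x}-\mathbf{x}'|} + \frac{(x_i - x'_i)(x_k - x'_k)}{|\mathbf{x}-\mathbf{x}'|^3}\right]\,.
\tag{12.104}
$$
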
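The above steps are valid for a generic function $\mathbf{j}(t,\mathbf{x})$. We now insert
the model (12.93). Then, in eq. (12.104),

$$
\begin{aligned}
\partial_t\mathbf{j}_a(t,\mathbf{x}') &= \rho_a(t,\mathbf{x}')\,\dot{\mathbf{v}}_a(t) + [\partial_t\rho_a(t,\mathbf{x}')]\,\mathbf{v}_a(t) \\
&= \rho_a(t,\mathbf{x}')\,\dot{\mathbf{v}}_a(t) - [\nabla_{\mathbf{x}'}\cdot\mathbf{j}_a(t,\mathbf{x}')]\,\mathbf{v}_a(t)\,.
\end{aligned}
\tag{12.105}
$$

The second term on the right-hand side is of order $v_a^2$, since $\mathbf{j}_a$ is
proportional to $\mathbf{v}_a$, so

$$
\partial_t\mathbf{j}_a(t,\mathbf{x}') = \rho_a(t,\mathbf{x}')\,\dot{\mathbf{v}}_a(t) + O(v_a^2)\,. \tag{12.106}
$$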


We insert this into eq. (12.104) and we limit ourselves to the contributions
linear in $\mathbf{v}_a$. This will be sufficient to understand how renormalization
works for the spatial momentum, and can in principle be extended
order by order to include higher powers of $v_a$ (in the explicitly covariant
computation that we will perform later, all these higher-order terms will
be automatically included). Then, we get

$$
\left[\int d^3x\, \rho_a(t,\mathbf{x})\,E_a^i(t,\mathbf{x})\right]_{\rm 1PN}
= -\frac{1}{4\pi\epsilon_0}\frac{\dot{v}_a^j(t)}{2c^2}\int d^3x\, d^3x'\, \rho_a(t,\mathbf{x})\,\rho_a(t,\mathbf{x}')
\left[\frac{\delta_{ij}}{|\mathbf{x}-\mathbf{x}'|} + \frac{(x_i - x'_i)(x_j - x'_j)}{|\mathbf{x}-\mathbf{x}'|^3}\right]\,.
\tag{12.107}
$$

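It is convenient to choose $\rho_a(t,\mathbf{x})$ so that it is spherically symmetric, i.e.,
invariant under rotations around the center of the distribution, defined
by eq. (12.90).$^{17}$ Then, by symmetry, the integral in eq. (12.107) can
only be proportional to $\delta_{ij}$, since, if $\rho_a(t,\mathbf{x})$ and $\rho_a(t,\mathbf{x}')$ are
spherically symmetric, there are no privileged directions inside the integral.$^{18}$
Therefore, in the integrand, we can replace

$$
(x_i - x'_i)(x_j - x'_j) \;\to\; \frac{1}{3}\,\delta_{ij}\,|\mathbf{x}-\mathbf{x}'|^2 \tag{12.108}
$$

and we get

$$
\left[\int d^3x\, \rho_a(t,\mathbf{x})\,\mathbf{E}_a(t,\mathbf{x})\right]_{\rm 1PN}
= -\frac{1}{4\pi\epsilon_0}\frac{2\dot{\mathbf{v}}_a(t)}{3c^2}\int d^3x\, d^3x'\, \frac{\rho_a(t,\mathbf{x})\,\rho_a(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}\,.
\tag{12.109}
$$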
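
The angular replacement (12.108), and the resulting factor $1 + 1/3 = 4/3$ in the bracket of eq. (12.107), can be illustrated with a small Monte Carlo average over directions. This is only a sketch: the sampling scheme, the sample size, and the function names are choices made here and are not taken from the text.

```python
import math
import random

def random_unit_vector(rng):
    """Uniformly distributed direction on the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]

def average_outer_product(n_samples=100_000, seed=0):
    """Monte Carlo estimate of <n_i n_j> over all directions n."""
    rng = random.Random(seed)
    acc = [[0.0] * 3 for _ in range(3)]
    for _ in range(n_samples):
        n = random_unit_vector(rng)
        for i in range(3):
            for j in range(3):
                acc[i][j] += n[i] * n[j]
    return [[acc[i][j] / n_samples for j in range(3)] for i in range(3)]

avg = average_outer_product()
# After angular averaging the bracket of eq. (12.107) becomes
# delta_ij * (1 + trace/3) / |x - x'|; the trace of <n_i n_j> is 1.
bracket_factor = 1.0 + (avg[0][0] + avg[1][1] + avg[2][2]) / 3.0
```

The off-diagonal entries of `avg` are compatible with zero and the diagonal ones with 1/3, which is the content of eq. (12.108) with $n_i = (x_i - x'_i)/|\mathbf{x}-\mathbf{x}'|$.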
If we replace here $\rho_a(t,\mathbf{x})$ and $\rho_a(t,\mathbf{x}')$ by Dirac deltas, the integral
diverges. However, this is the same integral that we met in Section 5.2.2,
and with our regularization in which we set to zero the Fourier modes
with $|\mathbf{k}| > \pi/\ell$, it is given by [compare eqs. (5.25) and (5.31)]

$$
\frac{1}{2}\int d^3x\, d^3x'\, \frac{\rho_a(t,\mathbf{x})\,\rho_a(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} = \frac{q_a^2}{\ell}\,. \tag{12.110}
$$
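
As a cross-check, the value in eq. (12.110) can be recovered in Fourier space. The following is a sketch that assumes a sharp cutoff, i.e., $\tilde\rho_a(\mathbf{k}) = q_a$ for $|\mathbf{k}| < \pi/\ell$ and zero otherwise (with the distribution centered at the origin), together with the standard Fourier transform $4\pi/k^2$ of the kernel $1/|\mathbf{x}-\mathbf{x}'|$:

```latex
\begin{aligned}
\frac{1}{2}\int d^3x\, d^3x'\, \frac{\rho_a(t,\mathbf{x})\,\rho_a(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}
&= \frac{1}{2}\int \frac{d^3k}{(2\pi)^3}\, \frac{4\pi}{k^2}\, |\tilde\rho_a(\mathbf{k})|^2 \\
&= \frac{q_a^2}{2}\,\frac{4\pi\cdot 4\pi}{(2\pi)^3}\int_0^{\pi/\ell} dk
 = \frac{q_a^2}{2}\,\frac{2}{\pi}\,\frac{\pi}{\ell}
 = \frac{q_a^2}{\ell}\,.
\end{aligned}
```

The linear divergence as $\ell \to 0$ is explicit in this form.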
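Inserting eq. (12.110) into eq. (12.109), we get

$$
\left[\int d^3x\, \rho_a(t,\mathbf{x})\,\mathbf{E}_a(t,\mathbf{x})\right]_{\rm 1PN}
= -\frac{4}{3}\left(\frac{1}{4\pi\epsilon_0}\frac{q_a^2}{\ell}\right)\frac{\dot{\mathbf{v}}_a(t)}{c^2}\,. \tag{12.111}
$$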

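Since we are limiting ourselves to terms linear in $\mathbf{v}_a$, the contribution
of the term $\mathbf{j}_a\times\mathbf{B}_a$ in eq. (12.95) can be neglected, since $\mathbf{B}_a$, as $\mathbf{A}_a$, is
proportional to $\mathbf{j}_a$ and therefore $\mathbf{j}_a\times\mathbf{B}_a$ is quadratic in $\mathbf{j}_a$ and therefore
in $\mathbf{v}_a$. Then, to linear order in $\mathbf{v}_a$, the equation of motion (12.95) reads

$$
\tilde{m}_{0,a}(\ell)\,\dot{\mathbf{v}}_a = -\frac{4}{3}\left(\frac{1}{4\pi\epsilon_0}\frac{q_a^2}{\ell}\right)\frac{\dot{\mathbf{v}}_a}{c^2} + \mathbf{F}_{a,\rm ext}\,. \tag{12.112}
$$

We can rewrite this as

$$
\left[\tilde{m}_{0,a}(\ell) + \frac{4}{3}\left(\frac{1}{4\pi\epsilon_0 c^2}\right)\frac{q_a^2}{\ell}\right]\dot{\mathbf{v}}_a = \mathbf{F}_{a,\rm ext}\,, \tag{12.113}
$$

and we see that the self-force term can be reabsorbed into a renormalization
of the mass, choosing the bare parameter $\tilde{m}_{0,a}(\ell)$ so that

$$
m_a = \tilde{m}_{0,a}(\ell) + \frac{4}{3}\left(\frac{1}{4\pi\epsilon_0 c^2}\right)\frac{q_a^2}{\ell}\,, \tag{12.114}
$$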


17 Recall that $\rho(t,\mathbf{x})$ is just a regularization
of the Dirac delta. Our regularization has been
defined by the condition that the Fourier modes
with $|\mathbf{k}| > \pi/\ell$ vanish, and this condition is
invariant under spatial rotations. We are now
further requiring that the non-vanishing Fourier
modes $\tilde\rho(t,\mathbf{k})$ actually depend only on $|\mathbf{k}|$,
rather than on the full vector $\mathbf{k}$, so as to give a
rotationally invariant distribution $\rho(t,\mathbf{x})$.
This assumption is just useful to simplify the
computation. In any case, any deviation from
spherical symmetry would give a vanishing
contribution when removing the cutoff and
approaching the Dirac delta distribution, which is
spherically symmetric.


18 We can check this explicitly, choosing a reference
frame so that, at a given time $t$, $\mathbf{x}_a(t) = 0$, so
rotations around $\mathbf{x}_a$ are the same as rotations
around the origin of the reference frame. Then, a
spherically symmetric function $\rho_a(t,\mathbf{x})$ is invariant
under any transformation that leaves $|\mathbf{x}|$
invariant; one such transformation is
$\mathbf{x} = (x,y,z) \to (-x,y,z)$. If, in the integral in
eq. (12.107), we transform simultaneously
$\mathbf{x} = (x,y,z) \to (-x,y,z)$ and
$\mathbf{x}' = (x',y',z') \to (-x',y',z')$,
the factors $\rho_a(t,\mathbf{x})$, $\rho_a(t,\mathbf{x}')$ are invariant.
Also $|\mathbf{x}-\mathbf{x}'|$ is invariant, since
$|\mathbf{x}-\mathbf{x}'|^2 = (x-x')^2 + (y-y')^2 + (z-z')^2$
is unchanged if, simultaneously, $x \to -x$ and
$x' \to -x'$. In contrast, the term
$(x_i - x'_i)(x_j - x'_j)$ with $i = 1$ and $j \neq 1$
changes sign; therefore, the part of the integrand
proportional to $(x_i - x'_i)(x_j - x'_j)$ is odd under
this transformation, and its contribution to the
integral vanishes. Similarly, all other terms with
$i \neq j$ vanish. When $i = j$, in contrast, the result
is independent of the value of $i$, again because of
rotational symmetry: the integral with $i = j = 1$,
which involves $(x - x')^2$, is the same as that with
$i = j = 2$, which involves $(y - y')^2$, or that with
$i = j = 3$, which involves $(z - z')^2$.
19 In a full quantum setting, also the
charge $q$ should be replaced by a bare
parameter $q_0(\ell)$ and renormalized,
although this will not concern us at the
classical level.


where $m_a$ is the observed value of the mass of the charged particle, say
of the electron.

Note that the self-field contribution to the momentum differs by a
factor of 4/3 from the self-field contribution to the rest energy, resulting
in the extra 4/3 factor in the second term on the right-hand side of
eq. (12.114) compared to eq. (12.86). As we already anticipated, within
the renormalization logic, there is nothing surprising about it. Simply,
we have broken Lorentz invariance with the regularization, and we must
then use two different bare parameters $m_{0,a}(\ell)$ and $\tilde{m}_{0,a}(\ell)$, associated
with energy and with momentum, respectively, in order to recover the same
value of the renormalized mass $m_a$.
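
To get a feeling for the numbers, the two self-field terms can be evaluated for an electron. This is a rough sketch: it assumes that the self-field term in eq. (12.86) has the form $(1/4\pi\epsilon_0 c^2)\, q_a^2/\ell$, as implied by the relative factor 4/3 with eq. (12.114), and the function names and the sample cutoff are hypothetical.

```python
import math

# SI values of the constants (rounded)
E_CHARGE = 1.602176634e-19     # C
EPS0 = 8.8541878128e-12        # F/m
C_LIGHT = 299792458.0          # m/s
M_ELECTRON = 9.1093837015e-31  # kg

def self_mass_energy(ell, q=E_CHARGE):
    """Self-field term in the rest-energy relation (assumed form of eq. (12.86))."""
    return q**2 / (4.0 * math.pi * EPS0 * C_LIGHT**2 * ell)

def self_mass_momentum(ell, q=E_CHARGE):
    """Self-field term in the momentum relation, eq. (12.114)."""
    return (4.0 / 3.0) * self_mass_energy(ell, q)

def bare_masses(ell, m_obs=M_ELECTRON):
    """Bare parameters (m0, m0_tilde) reproducing m_obs at cutoff ell."""
    return m_obs - self_mass_energy(ell), m_obs - self_mass_momentum(ell)

# Cutoff at which the energy self-term alone equals the electron mass
ell_star = E_CHARGE**2 / (4.0 * math.pi * EPS0 * C_LIGHT**2 * M_ELECTRON)

# Bare parameters for a sample cutoff (value chosen only for illustration)
m0, m0_tilde = bare_masses(1e-12)
```

The length `ell_star` is the classical electron radius, of order $10^{-15}$ m; for cutoffs shorter than this the bare parameters become negative, which is harmless in the renormalization logic since only $m_a$ is observable.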

The conclusion of this section is that the “naive” treatment of
Section 12.2, in which the self-energy terms were simply discarded when
studying the 1PN dynamics, eventually gives the correct result because,
when the self-energy terms are correctly taken into account, to 1PN
order they can just be reabsorbed into a renormalization of the mass.
This requires first a regularization of the theory. If, as we have done
in this section, we use a regularization that breaks Lorentz invariance,
then we must use two different bare mass terms for energy and for spatial
momentum, in order to reabsorb the divergences. However (despite
the fact that the extra 4/3 factor between eqs. (12.86) and (12.114) has
often created confusion, to the extent of being called “the infamous 4/3
factor”), within a proper approach based on regularization and
renormalization this is just a minor technical point, of no special consequence.
In Section 12.3.5 we will show how to regularize and renormalize the
theory in a fully Lorentz-covariant manner, and then a single bare mass
term will be sufficient to renormalize the four-momentum.

To conclude this section we observe that, in the classical theory that 
we are considering, we can equivalently discuss renormalization at the 
level of the equations of motion, as we have done, or at the level of the 
action. To make contact with the quantum field theory treatment, where 
one rather works at the level of the action, it is useful to show explicitly 
the form of a bare action that corresponds to the introduction of two 
different bare mass terms for the rest energy and for spatial momentum. 
Consider the bare action (suppressing for simplicity the label a of the 
particle) 


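$$
S_{\rm bare} = -m_0(\ell)\,c^2\int d\tau - \frac{1}{2}\left[m_0(\ell) - \tilde{m}_0(\ell)\right]\int dt\, v^2
+ q\int dx^\mu A_\mu(x)\,. \tag{12.115}
$$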


The first and third terms were already given in eq. (8.69). The first
term describes the action of a free particle, with $m$ now replaced by a
bare parameter $m_0(\ell)$, and the third describes its interaction with the
electromagnetic field.$^{19}$ In particular, the third term is responsible for
the form (8.1, 8.2) of the charge and current densities (or, equivalently,
of the covariant expression (8.3) for $j^\mu$), as we saw in eq. (8.73).
Supplemented with the regularization $|\mathbf{k}| < \pi/\ell$ on the Fourier modes of the
Dirac delta, this action was the starting point of our computation. The
second term in eq. (12.115) is proportional to the non-relativistic action
of a free particle and is therefore a counterterm which is not
Lorentz-invariant. The corresponding Lagrangian, limiting ourselves to the free
part, is

$$
L = -m_0(\ell)\,c^2\sqrt{1 - \frac{v^2}{c^2}} - \frac{1}{2}\left[m_0(\ell) - \tilde{m}_0(\ell)\right]v^2\,. \tag{12.116}
$$


Keeping the terms up to order $v^2$ (which are enough to compute the rest
energy and the term in the momentum linear in the velocity), we have

$$
L = -m_0(\ell)\,c^2 + \frac{1}{2}\,\tilde{m}_0(\ell)\,v^2 + O(v^4)\,. \tag{12.117}
$$

Therefore the bare rest energy is still $E_{\rm rest} = m_0(\ell)\,c^2$, while, to linear
order in $v$, the bare momentum of a free particle is $\mathbf{p} = \partial L/\partial\mathbf{v} = \tilde{m}_0(\ell)\,\mathbf{v}$.
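
The step from eq. (12.116) to eq. (12.117) can be spelled out as follows. This is a sketch in which $\tilde{m}_0(\ell)$ denotes the bare parameter entering the momentum (a notational assumption of this reconstruction) and $m_0(\ell)$ the one entering the rest energy:

```latex
\begin{aligned}
L &= -m_0 c^2\sqrt{1 - \frac{v^2}{c^2}} - \frac{1}{2}\,(m_0 - \tilde{m}_0)\,v^2 \\
  &= -m_0 c^2\left(1 - \frac{v^2}{2c^2} + O(v^4/c^4)\right) - \frac{1}{2}\,(m_0 - \tilde{m}_0)\,v^2 \\
  &= -m_0 c^2 + \frac{1}{2}\,\tilde{m}_0\, v^2 + O(v^4)\,, \\
\mathbf{p} &= \frac{\partial L}{\partial\mathbf{v}} = \tilde{m}_0\,\mathbf{v} + O(v^3)\,, \qquad
E = \mathbf{p}\cdot\mathbf{v} - L = m_0 c^2 + \frac{1}{2}\,\tilde{m}_0\, v^2 + O(v^4)\,.
\end{aligned}
```

The counterterm therefore shifts only the coefficient of $v^2$, leaving the rest energy governed by $m_0(\ell)$.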


12.3.3 Radiation reaction at 1.5PN order 


As we have discussed in Section 12.1, the radiation field itself cannot be 
computed within the PN expansion: radiation appears in the far zone, 
while the PN expansion is only valid in the near zone. However, the 
PN expansion allows us to study the dynamics of the system of charges 
in the near zone so, from the PN expansion, we must be able to see 
that the mechanical energy of the system of charges decreases, so as to 
compensate for the energy that is carried away by the electromagnetic 
waves radiated by the system. In Section 12.2 we found that, up to 1PN 
order, the dynamics of a system of point particles is conservative. The 
existence of dissipative effects can be found by pushing the PN expansion
up to 1.5PN order, i.e., up to terms proportional to $1/c^3$. This could
have already been anticipated from the fact that the power radiated by
a non-relativistic charge is proportional to $1/c^3$, as we see from Larmor’s
formula (10.148), while its relativistic generalization (10.156) has a more
complicated dependence on $c$ that, again, when expanded in powers of
$1/c$, starts with the $O(1/c^3)$ Larmor term.

In this section we then compute the 1.5PN contribution to the gauge 
potentials, and therefore to the equations of motion. In Section 12.3.4 
we will show how a covariantization of the result gives the expression 
for radiation reaction to all orders, leading to the so-called Abraham- 
Lorentz—Dirac (ALD) equation. In Section 12.3.5 we will finally show 
how mass renormalization and the ALD equation can be derived in a 
unified treatment, with mass renormalization obtained from an explicitly 
Lorentz-invariant computation. 

Given that we are looking for dissipative terms, it will be important to 
compute the effect on the equations of motion starting from the Lorentz 
force equation, rather than working at the level of the action. We then 
expand eqs. (12.1) and (12.2) to 1.5PN order, and insert them in the 
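equations of motion. The 1.5PN contributions are

$$
\phi_{\rm 1.5PN}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\frac{1}{6c^3}\,\partial_t^3\int d^3x'\, \rho(t,\mathbf{x}')\,|\mathbf{x}-\mathbf{x}'|^2\,, \tag{12.118}
$$
$$
\mathbf{A}_{\rm 1.5PN}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\frac{1}{c^3}\,\partial_t\int d^3x'\, \mathbf{j}(t,\mathbf{x}')\,. \tag{12.119}
$$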
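
The coefficients in eqs. (12.118) and (12.119) come from the Taylor expansion of the retarded argument, $f(t - r/c)/r = \sum_n (-r/c)^n f^{(n)}(t)/(n!\, r)$. The sketch below checks this expansion numerically; the test function $f(t) = \sin t$ and the numerical values of `t`, `r`, `c` are assumptions chosen only for the illustration.

```python
import math

def retarded_over_r(f, t, r, c):
    """Exact retarded quantity f(t - r/c) / r."""
    return f(t - r / c) / r

def pn_expansion(derivs, t, r, c, order):
    """Expansion in 1/c: sum over n of (-r/c)^n f^(n)(t) / (n! r)."""
    return sum(
        (-r / c) ** n * derivs[n](t) / (math.factorial(n) * r)
        for n in range(order + 1)
    )

# f(t) = sin(t) and its first three derivatives
derivs = [math.sin, math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t)]

t, r, c = 0.4, 1.0, 20.0
exact = retarded_over_r(math.sin, t, r, c)
# Truncation error when keeping terms up to (1/c)^n, n = 0..3
errors = [abs(exact - pn_expansion(derivs, t, r, c, n)) for n in range(4)]
# Coefficient multiplying 1/c^3: -r^2 f'''(t) / 6, cf. eq. (12.118)
coeff_c3 = -(r**2) * derivs[3](t) / 6.0
```

Each additional power of $1/c$ reduces the truncation error, and the $n = 3$ term carries the factor $1/6$ and the $|\mathbf{x}-\mathbf{x}'|^2$ dependence seen in the scalar potential at 1.5PN order.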


Using eq. (11.102), the vector potential can also be rewritten in terms
of the electric dipole moment $\mathbf{d}(t)$ as

$$
\mathbf{A}_{\rm 1.5PN}(t,\mathbf{x}) = -\frac{1}{4\pi\epsilon_0}\frac{1}{c^3}\,\ddot{\mathbf{d}}(t)\,. \tag{12.120}
$$


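Note that $\mathbf{A}_{\rm 1.5PN}(t,\mathbf{x})$ actually depends only on $t$, and is independent
of $\mathbf{x}$. To compute the corresponding electric field we observe that

$$
\begin{aligned}
\nabla\phi_{\rm 1.5PN} &= -\frac{1}{4\pi\epsilon_0}\frac{1}{6c^3}\,\partial_t^3\int d^3x'\, \rho(t,\mathbf{x}')\,2(\mathbf{x}-\mathbf{x}') \\
&= -\frac{1}{4\pi\epsilon_0}\frac{1}{3c^3}\,\partial_t^3\left[Q\,\mathbf{x} - \mathbf{d}(t)\right] \\
&= \frac{1}{4\pi\epsilon_0}\frac{1}{3c^3}\,\dddot{\mathbf{d}}(t)\,,
\end{aligned}
\tag{12.121}
$$

where $Q$ is the total charge, and we observed that the term $Q\,\mathbf{x}$ is independent
of time and gives zero when we apply $\partial_t^3$ to it. Then

$$
\begin{aligned}
\mathbf{E}_{\rm 1.5PN}(t,\mathbf{x}) &= -\nabla\phi_{\rm 1.5PN} - \partial_t\mathbf{A}_{\rm 1.5PN} \\
&= \frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,\dddot{\mathbf{d}}(t)\,.
\end{aligned}
\tag{12.122}
$$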
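For the magnetic field we have $\mathbf{B}_{\rm 1.5PN}(t,\mathbf{x}) = 0$ since $\mathbf{A}_{\rm 1.5PN}$ is
independent of $\mathbf{x}$, and therefore $\nabla\times\mathbf{A}_{\rm 1.5PN} = 0$. Putting these expressions
in the Lorentz force equation, we get the 1.5PN contribution to the
equation of motion of the a-th particle,

$$
\left(\frac{d\mathbf{p}_a}{dt}\right)_{\rm 1.5PN} = \frac{1}{4\pi\epsilon_0}\frac{2q_a}{3c^3}\,\dddot{\mathbf{d}}(t)\,, \tag{12.123}
$$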


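where, as usual, $\mathbf{p}_a = \gamma_a m_a\mathbf{v}_a$. Combining this with eq. (12.50), we can
write the equation of motion, up to 1.5PN order, in the form

$$
m_a\frac{d\mathbf{v}_a}{dt} = \left(\mathbf{F}_{\rm N} + \mathbf{F}_{\rm 1PN} + \mathbf{F}_{\rm 1.5PN}\right)_a\,, \tag{12.124}
$$

where

$$
(\mathbf{F}_{\rm N})_a = \frac{1}{4\pi\epsilon_0}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}^2}\,\hat{\mathbf{r}}_{ab} \tag{12.125}
$$

is the Newtonian force on the a-th particle, $(\mathbf{F}_{\rm 1PN})_a$ is obtained collecting
all the terms proportional to $1/c^2$ in eq. (12.50), and

$$
(\mathbf{F}_{\rm 1.5PN})_a = \frac{1}{4\pi\epsilon_0}\frac{2q_a}{3c^3}\,\dddot{\mathbf{d}}(t)\,. \tag{12.126}
$$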
The notation in eq. (12.124) is suggestive of the Newtonian equation of
motion $\mathbf{F} = m\mathbf{a}$. The analogy, however, is purely formal. For instance,
in $\mathbf{F}_{\rm 1PN}$ we have collected all terms proportional to $1/c^2$ in eq. (12.50),
including the term that comes from the expansion of $\gamma_a$ in $\mathbf{p}_a = \gamma_a m_a\mathbf{v}_a$,
which has nothing to do with the interaction of the a-th particle with
the other particles. This notation, however, will be useful in the steps
that we perform below.

We now compute how the energy changes in time. First of all, we must
define energy in this context. This is done by looking at the conservative
part of the dynamics (as obtained eliminating the gauge fields in favor
of $\mathbf{x}_a(t)$ and $\mathbf{v}_a(t)$ at the level of the action), so that we can define a
Lagrangian and the corresponding Hamiltonian. In our case, since we
are working up to 1.5PN order, the conservative dynamics include the
terms up to 1PN order, and the Hamiltonian is given by eq. (12.71). We
can then write the Hamiltonian as

$$
H = H_{\rm N} + H_{\rm 1PN}\,, \tag{12.127}
$$

where

$$
H_{\rm N} = \sum_a \frac{\mathbf{P}_a^2}{2m_a} + \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_a\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\,, \tag{12.128}
$$

while $H_{\rm 1PN}$ can be obtained writing $\mathbf{P}_a$ as in eq. (12.66), and then
collecting the $1/c^2$ terms in eq. (12.71). So, the corresponding
energy is

$$
E = E_{\rm N} + E_{\rm 1PN}\,, \tag{12.129}
$$
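where the Newtonian part is given by the usual form

$$
E_{\rm N} = \sum_{a=1}^{N}\frac{1}{2}\,m_a v_a^2 + U\,, \tag{12.130}
$$

with the potential energy $U$ given by

$$
U = \frac{1}{2}\,\frac{1}{4\pi\epsilon_0}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\,, \tag{12.131}
$$

while $E_{\rm 1PN}$ can be written in terms of positions and velocities (rather
than positions and momenta) by using eq. (12.66) in $H_{\rm N}$ (and using
simply $\mathbf{P}_a = m_a\mathbf{v}_a$ in $H_{\rm 1PN}$) and collecting the terms proportional to
$1/c^2$. We now write

$$
\begin{aligned}
\frac{dE}{dt} &= \frac{d}{dt}\left[\sum_{a=1}^{N}\frac{1}{2}\,m_a v_a^2\right] + \frac{dU}{dt} + \frac{dE_{\rm 1PN}}{dt} \\
&= \sum_{a=1}^{N}\frac{d(m_a\mathbf{v}_a)}{dt}\cdot\mathbf{v}_a + \frac{dU}{dt} + \frac{dE_{\rm 1PN}}{dt} \\
&= \sum_{a=1}^{N}\left(\mathbf{F}_{\rm N} + \mathbf{F}_{\rm 1PN} + \mathbf{F}_{\rm 1.5PN}\right)_a\cdot\mathbf{v}_a + \frac{dU}{dt} + \frac{dE_{\rm 1PN}}{dt}\,,
\end{aligned}
\tag{12.132}
$$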
20 Indeed, in all standard textbook
derivations, after having obtained
eq. (12.123), it is simply stated, without
much explanation, that the term
$\mathbf{F}_{\rm 1.5PN}$ on the right-hand side is a force
that performs work at a rate
$W_a = (\mathbf{F}_{\rm 1.5PN})_a\cdot\mathbf{v}_a$
on the particle a. While this eventually
leads to the correct answer, without
the more explicit steps that we have
performed it would be unclear why one
should use a non-relativistic formula
such as $W = \mathbf{F}\cdot\mathbf{v}$ in our relativistic
context. The derivation that we have
provided also makes it clear that the
energy that appears in $dE/dt$, when we
compute radiation reaction up to 1.5PN
order, is the energy obtained from the
conservative part of the dynamics up to
1PN order.


21 One might ask what happens if we 
rather insert eqs. (12.118) and (12.120) 
into the action (12.16), and carry 
out the same steps that we did in 
eqs. (12.51)—(12.55) when we computed 
the 1PN Lagrangian. The result is that 
the resulting 1.5PN contribution to the 
Lagrangian is a total time derivative, 
and therefore does not affect the equa- 
tions of motion. As we already dis- 
cussed, the dissipative term cannot be 
obtained from a Lagrangian, and we 
must rather eliminate the gauge fields 
at the level of the equations of motion. 


where, in the last line, we used eq. (12.124). The crucial point, now, is
that, as we have seen, the dynamics up to 1PN order is conservative, so
$E = E_{\rm N} + E_{\rm 1PN}$ is conserved if we use the equations of motion up to
1PN order, i.e., $dE/dt = 0$ when $\mathbf{F}_{\rm 1.5PN}$ is not included in the equation.
Therefore,

$$
\sum_{a=1}^{N}\left(\mathbf{F}_{\rm N} + \mathbf{F}_{\rm 1PN}\right)_a\cdot\mathbf{v}_a + \frac{dU}{dt} + \frac{dE_{\rm 1PN}}{dt} = 0\,. \tag{12.133}
$$

Then, eq. (12.132) gives

$$
\frac{dE}{dt} = \sum_{a=1}^{N}\left(\mathbf{F}_{\rm 1.5PN}\right)_a\cdot\mathbf{v}_a\,. \tag{12.134}
$$

Note that, formally, $(\mathbf{F}_{\rm 1.5PN})_a\cdot\mathbf{v}_a$ is the same as the work per unit time that would
be done by a Newtonian force $\mathbf{F}_{\rm 1.5PN}$ on the non-relativistic particle a,
in the context of non-relativistic mechanics. Once again, the analogy
is purely formal, and comes from the fact that we have written the
equation of motion, including relativistic corrections up to order
$(v/c)^3$, in the form (12.124), where $\mathbf{F}_{\rm N} + \mathbf{F}_{\rm 1PN}$ formally plays the role of
a conservative Newtonian force while $\mathbf{F}_{\rm 1.5PN}$ plays that of a dissipative force.$^{20,21}$
Inserting eq. (12.126) into eq. (12.134) we get

$$
\frac{dE}{dt} = \frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,\dddot{\mathbf{d}}(t)\cdot\sum_{a=1}^{N} q_a\mathbf{v}_a(t)\,. \tag{12.135}
$$


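However, $q_a\mathbf{v}_a = \dot{\mathbf{d}}_a$, and $\sum_{a=1}^{N}\dot{\mathbf{d}}_a = \dot{\mathbf{d}}$, where $\mathbf{d}$ is the total dipole
moment of the system. Therefore, we can rewrite eq. (12.135) as

$$
\begin{aligned}
\frac{dE}{dt} &= \frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,\dddot{\mathbf{d}}\cdot\dot{\mathbf{d}} \\
&= \frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\left[\frac{d}{dt}\left(\ddot{\mathbf{d}}\cdot\dot{\mathbf{d}}\right) - |\ddot{\mathbf{d}}|^2\right]\,.
\end{aligned}
\tag{12.136}
$$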


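The second term in eq. (12.136) corresponds precisely to minus the power
radiated, as given by the Larmor formula, eq. (10.148), and reproduces the
fact that the energy of the system decreases because it radiates
electromagnetic waves. To deal with the first term, different options are
possible. One possibility is to average this equation over a time $T$.
Denoting this average with angular brackets, we have, in particular,

$$
\left\langle\frac{d}{dt}\left(\ddot{\mathbf{d}}\cdot\dot{\mathbf{d}}\right)\right\rangle
= \frac{1}{T}\int_{-T/2}^{T/2} dt\,\frac{d}{dt}\left(\ddot{\mathbf{d}}\cdot\dot{\mathbf{d}}\right)
= \frac{1}{T}\left[\ddot{\mathbf{d}}\cdot\dot{\mathbf{d}}\right]_{-T/2}^{T/2}\,. \tag{12.137}
$$

The right-hand side vanishes if $\mathbf{d}(t)$ is a periodic function with period
$T$, or if we send $T \to \infty$ with $\mathbf{d}(t)$ that vanishes for $t \to \pm\infty$. In these
cases, we get

$$
\left\langle\frac{dE}{dt}\right\rangle = -\frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,\left\langle|\ddot{\mathbf{d}}|^2\right\rangle\,. \tag{12.138}
$$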
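
The averaging step between eqs. (12.136) and (12.138) can be illustrated for a harmonic dipole. The one-dimensional model $d(t) = d_0\cos(\omega t)$, the values of `d0` and `omega`, and the function names are assumptions made only for this sketch.

```python
import math

def dipole_derivatives(t, d0=1.0, omega=2.0):
    """First three time derivatives of d(t) = d0 cos(omega t)."""
    d1 = -d0 * omega * math.sin(omega * t)
    d2 = -d0 * omega**2 * math.cos(omega * t)
    d3 = d0 * omega**3 * math.sin(omega * t)
    return d1, d2, d3

def period_average(f, omega=2.0, n=20_000):
    """Average of f(t) over one period T = 2 pi / omega (midpoint rule)."""
    period = 2.0 * math.pi / omega
    h = period / n
    return sum(f((i + 0.5) * h) for i in range(n)) / n

def d3_dot_d1(t):
    d1, _, d3 = dipole_derivatives(t)
    return d3 * d1

def d2_squared(t):
    _, d2, _ = dipole_derivatives(t)
    return d2 * d2

lhs = period_average(d3_dot_d1)    # average of d''' . d', first line of eq. (12.136)
rhs = -period_average(d2_squared)  # minus the average of |d''|^2, eq. (12.138)
```

Both averages equal $-d_0^2\omega^4/2$, so the total-derivative term in eq. (12.136) does not contribute over a full period.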




In this way, energy conservation is recovered, at least in an averaged
form: eq. (12.138) shows that the energy of the system of charges
decreases, exactly in such a way to compensate (when averaged over one
period of the source motion) the energy radiated in electromagnetic
waves, which, to the order $1/c^3$ to which we are working, is given by
the Larmor formula (10.148).$^{22}$

The need for a time averaging, however, is not really satisfying;
classically, we expect that energy conservation should be valid instantaneously
(and should not be restricted to periodic motions, or to situations where
the source becomes static as $t \to \pm\infty$ and energy conservation is only
recovered as an average over a time $T \to \infty$). However, we can get rid
of the averaging procedure if we define

$$
\begin{aligned}
E_{\rm 1.5PN} &= -\frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,\ddot{\mathbf{d}}\cdot\dot{\mathbf{d}} \\
&= -\frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\sum_{a,b=1}^{N} q_a q_b\,\dot{\mathbf{v}}_a\cdot\mathbf{v}_b\,.
\end{aligned}
\tag{12.139}
$$

Since this quantity is proportional to $1/c^3$, we can interpret it as a
1.5PN contribution to the energy. Recalling that, on the left-hand side
of eq. (12.136), $E = E_{\rm N} + E_{\rm 1PN}$, we can then rewrite eq. (12.136) as

$$
\frac{d}{dt}\left(E_{\rm N} + E_{\rm 1PN} + E_{\rm 1.5PN}\right) = -\frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\,|\ddot{\mathbf{d}}|^2\,. \tag{12.140}
$$

In this way, the 1.5PN dynamics produces both a conservative term, that
contributes to the energy and is localized in the near region, and the
dissipative term that describes the loss of energy to radiation escaping
at infinity.$^{23}$

It is interesting to observe that the energy balance equation (12.140)
has been obtained using eq. (12.126) for the force $(\mathbf{F}_{\rm 1.5PN})_a$ acting on
the a-th particle and, in this expression, the dipole moment $\mathbf{d}$ is the
one obtained from the total charge density $\rho$, without excluding the
contribution from the a-th particle itself, on which the force acts. In this
way we correctly recovered the power radiated to order $1/c^3$, as given by
Larmor’s formula. If we had excluded the self-term by hand, as we did
when computing the 1PN dynamics in Section 12.2, in eq. (12.126) $\mathbf{d}(t)$
would have been replaced by $\sum_{b\neq a}\mathbf{d}_b(t)$, i.e., by $\mathbf{d}(t) - \mathbf{d}_a(t)$, and this,
when inserted into eq. (12.134), would not have reproduced correctly the
Larmor formula. Rather, on the right-hand side of eq. (12.140), instead
of $|\ddot{\mathbf{d}}|^2$, we would have found the combination $(|\ddot{\mathbf{d}}|^2 - \sum_a|\ddot{\mathbf{d}}_a|^2)$. We
see that the self-energy terms are indeed physical, and are needed to
obtain the correct radiation reaction force. In the computation to order
1PN that we performed in Section 12.2 we had arbitrarily thrown them
away. However, this “mistake” turned out to be without consequences
because, as we have shown in Section 12.3.2, once one includes them,
their contribution to the 1PN dynamics can just be reabsorbed into a

22 Note that, in the Larmor formula
(10.148), we used $t$ to denote the time
of a distant observer, and $t_{\rm ret}$ was the
retarded time, i.e., the time at which
the radiation was produced, which
corresponds to the variable that we are
denoting by $t$ here.


23 Observe that, for a conservative sys- 
tem, described by a Lagrangian, there 
is a natural and unique definition of the 
energy, in terms of the Hamiltonian cor- 
responding to the given Lagrangian. As 
we have seen (see Note 7 on page 304), 
this Hamiltonian is automatically con- 
served on the solutions of the equa- 
tions of motion. For a dissipative sys- 
tem there is no such unique definition 
of energy. However, having at our dis- 
posal an energy balance condition such 
as eq. (12.140), the natural definition is 
to include in the energy all terms that 
appear inside the time derivative on the 
left-hand side of eq. (12.140). 

Also note that $E_{\rm 1.5PN}$ is itself a total
time derivative, since eq. (12.139) can
be rewritten as

$$
E_{\rm 1.5PN} = -\frac{1}{4\pi\epsilon_0}\frac{1}{3c^3}\,\frac{d}{dt}\,|\dot{\mathbf{d}}|^2\,. \tag{12.141}
$$
24 In gravity, as described by General
Relativity, dipole radiation is absent
and the leading term in the radiated
power is given by the (mass)
quadrupole radiation, and is
proportional to $1/c^5$. Correspondingly, also
the back-reaction force starts at 2.5PN
order. See Maggiore (2007),
Section 5.1.7.


mass renormalization. We see, however, that at 1.5PN order they give 
a finite contribution, which is essential to recover the correct energy 
balance equation. Also note that, since this 1.5PN contribution is finite, 
we could compute it without specifying a regularization procedure. 
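
The role of the self-term can be made concrete with point charges, for which $\ddot{\mathbf{d}}_a = q_a\dot{\mathbf{v}}_a$. The following sketch compares $|\ddot{\mathbf{d}}|^2$ with the combination $|\ddot{\mathbf{d}}|^2 - \sum_a|\ddot{\mathbf{d}}_a|^2$ that would appear if the self-term were excluded by hand; the function name and the sample charges and accelerations (in arbitrary units) are hypothetical.

```python
def radiated_combinations(charges, accelerations):
    """Return (|d''|^2, |d''|^2 - sum_a |d_a''|^2) for point charges.

    charges: list of q_a; accelerations: list of 3-vectors dv_a/dt.
    Uses d_a'' = q_a * dv_a/dt for each particle.
    """
    dip_a = [[q * comp for comp in acc] for q, acc in zip(charges, accelerations)]
    total = [sum(d[i] for d in dip_a) for i in range(3)]
    full = sum(c * c for c in total)
    self_terms = sum(sum(c * c for c in d) for d in dip_a)
    return full, full - self_terms

# A single accelerated charge
full_1, no_self_1 = radiated_combinations([1.0], [[0.0, 0.0, 2.0]])
# Two opposite charges with opposite accelerations
full_2, no_self_2 = radiated_combinations(
    [1.0, -1.0], [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
)
```

For a single accelerated charge the combination without the self-term vanishes identically, so no energy loss would be predicted at all, in contradiction with Larmor's formula.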
Another comment concerns the fact that radiation reaction first appears
at order $1/c^3$, i.e., is associated with an odd power of $1/c$. This
is related to time reversal invariance. As we showed in Section 3.4,
Maxwell’s equations are invariant under time reversal. However, as we 
already discussed (see in particular Note 39 on page 86), the symmetries 
of the equations are not necessarily the same as the symmetries of their 
solutions. The other option is that we have a family of solutions, that 
transform into each other under the given symmetry transformation. In 
the case of a discrete transformation such as time reversal, $t \to -t$,
this means that we could have two solutions, that transform into each 
other under time reversal. This is indeed what happens when solving an 
equation such as $\Box A^\mu = 0$, since, as we have seen in Section 10.1, the
d’Alembertian operator has two Green’s functions, the advanced and the 
retarded ones, that are exchanged under time reversal, as we see from 
eq. (10.24). In our computations we have explicitly broken time reversal 
invariance by selecting, for physical reasons, the retarded Green’s func- 
tion. Consider now a system of charges, evolving under their mutual 
interaction. As we have seen, they will emit outgoing electromagnetic 
waves and, correspondingly, they lose energy. The time-reversal of this 
solution is obtained “running the film backward,” and corresponds to the 
rather strange situation in which incoming electromagnetic radiation im- 
pinges on the system of charges and pumps energy into it, furthermore 
exactly canceling any outgoing radiation generated by these accelerated 
charges (just as, in the original setting, there was no incoming radiation 
on the system). In practice, this is a solution that we will never observe 
in Nature because it requires incredibly fine-tuned initial conditions on 
the incoming radiation, but still it is a solution of Maxwell’s equations 
coupled to the sources. Exchanging the retarded with the advanced so- 
lution must therefore result in a change of sign of the radiation-reaction 
force, since, in the time-reversed situation, it will have to describe an 
energy gain rather than an energy loss. However, the exchange of the ad- 
vanced and retarded Green’s functions (10.24) can be formally obtained 
with the replacement $c \to -c$. This means that the radiation-reaction
force is necessarily associated with odd powers of 1/c, i.e., with half- 
integer orders of the PN expansion. As we saw, the scalar potential 
vanishes at 0.5PN order because of charge conservation, and the vector 
potential starts from 1PN order, so the first order at which radiation 
reaction could have appeared is 1.5PN, as indeed we have found.$^{24}$


12.3.4 The Abraham—Lorentz—Dirac equation 


In eq. (12.126) we found the radiation reaction force at leading order in 
v/c, which turned out to be the 1.5PN order. As we saw, this repro- 
duces the energy lost to electromagnetic waves by a system of charges, 
as predicted by Larmor’s formula (10.148). However, we found in
Section 10.6.1 that Larmor’s formula is just the power radiated to lowest
non-trivial order in $v/c$, and the full result for the power radiated by a
point charge, to all orders in $v/c$, is given by eq. (10.156), or, in covariant
form, by eq. (10.169). It must therefore be possible to find an expression
for the radiation reaction force, valid to all orders in $v/c$, which
corresponds to the full energy loss (10.156). To obtain it, one could compute
explicitly the exact self-field generated by a particle, to all orders in
$v/c$. We will follow this route in Section 12.3.5, where we will see that
this is indeed possible but the computation, while instructive, is quite
involved. A much simpler procedure, that we follow in this section, consists
in looking for a covariant generalization of eqs. (12.124) and (12.126).
We focus on a given single particle a, and we rewrite eqs. (12.124) and
(12.126) separating the total dipole moment $\mathbf{d}(t) = \sum_b\mathbf{d}_b(t)$ as
$\mathbf{d}(t) = \mathbf{d}_a(t) + \sum_{b\neq a}\mathbf{d}_b(t)$, and we reabsorb the term $\sum_{b\neq a}\dddot{\mathbf{d}}_b$ into the
external force exerted on particle a. Then, writing $\mathbf{d}_a(t) = q_a\mathbf{x}_a(t)$, so
that $\dddot{\mathbf{d}}_a(t) = q_a\ddot{\mathbf{v}}_a(t)$, we rewrite eqs. (12.124) and (12.126) as

$$
\frac{d}{dt}\left[m_a\left(1 + \frac{v_a^2}{2c^2}\right)v_a^i\right]
= \frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\,\frac{d^2v_a^i}{dt^2} + F^i_{a,\rm ext}\,, \tag{12.142}
$$

where we also moved the expansion of $\gamma(v_a)$ to order $v_a^2/c^2$, that in
eq. (12.124) was formally included in $\mathbf{F}_{\rm 1PN}$, back to where it belongs, as
a multiplicative factor for $m_a\mathbf{v}_a$.

We now proceed with the covariantization. In the absence of the
backreaction force, the covariantization of eq. (12.142) is just given by
eq. (8.62) with $F^{\mu\nu}=F^{\mu\nu}_{\rm ext}$, i.e.,

$$m_a\dot u_a^\mu=q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,, \tag{12.143}$$

where the dot denotes the derivative with respect to $\tau$, and we have
written explicitly that, in the Lorentz force equation, $F^{\mu\nu}_{\rm ext}(x)$ must be
evaluated on the particle world-line. We next observe that, on the right-hand
side of eq. (12.142), $d^2v^i/dt^2$ is naturally covariantized as the spatial
component of $\ddot u_a^\mu$, where the dot denotes the derivative with respect
to the proper time $\tau$ of the particle. However, this covariantization is not
unique, since any expression of the form $\ddot u_a^\mu+\alpha u_a^\mu$, with $\alpha$ an arbitrary
function of $u_a^\mu$, $\dot u_a^\mu$, $\ddot u_a^\mu$ and possibly of higher-order derivatives, is such
that its $\mu=i$ component reduces to $d^2v^i/dt^2$ in a frame where the
particle a is instantaneously at rest. This leads us to


$$m_a\dot u_a^\mu=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\left[\ddot u_a^\mu+\alpha(u_a,\dot u_a,\ddot u_a,\ldots)\,u_a^\mu\right]+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,. \tag{12.144}$$

Since $u_a^\mu$ is the only four-vector whose spatial components vanish in the
rest frame of the charge, no further freedom is possible. The function
$\alpha(u_a,\dot u_a,\ddot u_a,\ldots)$ can then be fixed by observing that, taking the derivative
with respect to $\tau$ of the relation $u_\mu u^\mu=-c^2$, one obtains the identity
25 Observe that the relative sign among
the two terms in parenthesis depends
on our signature $\eta_{\mu\nu}=(-,+,+,+)$.
With the opposite signature, in
eq. (12.147) one would have the
combination $[\ddot u_a^\mu-(u_a^\nu\ddot u_{a,\nu}/c^2)\,u_a^\mu]$,
and in eq. (12.149) would appear the
combination $[\ddot u_a^\mu+(\dot u_a^2/c^2)\,u_a^\mu]$.


26 Equation (12.151) was first found by 
Lorentz, while the relativistic expres- 
sion was first found by Abraham in 
1904, so the year before Einstein’s first 
paper on Special Relativity. Despite 
the fact that Special Relativity was yet 
to be formulated, he could get a rela- 
tivistic result by using Maxwell’s equa- 
tions (which, as we now know, im- 
plicitly contain Special Relativity!), see 
Rohrlich (2000) for a historical discus- 
sion. The covariant form of the equa- 
tion was first explicitly derived by Dirac 
in 1938. In Section 12.3.5 we will pro- 
vide a covariant derivation, conceptu- 
ally along the lines of Dirac’s deriva- 
tion, although somewhat different tech- 
nically. 


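$u_\mu\dot u^\mu=0$. Then, multiplying eq. (12.144) by $u_{a,\mu}$, we get the condition

$$0=u_{a,\mu}\left[\ddot u_a^\mu+\alpha(u_a,\dot u_a,\ddot u_a,\ldots)\,u_a^\mu\right]=u_{a,\mu}\ddot u_a^\mu-c^2\,\alpha(u_a,\dot u_a,\ddot u_a,\ldots)\,, \tag{12.145}$$

and therefore

$$\alpha(u_a,\dot u_a,\ddot u_a,\ldots)=\frac{1}{c^2}\,u_{a,\nu}\ddot u_a^\nu\,. \tag{12.146}$$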

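Inserting this into eq. (12.144), we finally get

$$m_a\dot u_a^\mu=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\left(\ddot u_a^\mu+\frac{u_a^\nu\ddot u_{a,\nu}}{c^2}\,u_a^\mu\right)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,. \tag{12.147}$$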


We can rewrite eq. (12.147) in an equivalent form by taking one more
derivative of the identity $u_\mu\dot u^\mu=0$, to get $\dot u_\mu\dot u^\mu+u_\mu\ddot u^\mu=0$, and therefore

$$u_\mu\ddot u^\mu=-\dot u^2\,, \tag{12.148}$$


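so eq. (12.147) can also be rewritten as²⁵

$$m_a\dot u_a^\mu=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\left(\ddot u_a^\mu-\frac{\dot u_a^2}{c^2}\,u_a^\mu\right)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,. \tag{12.149}$$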
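The equivalence of the two brackets in eqs. (12.147) and (12.149), and their orthogonality to the four-velocity, can be checked numerically. The following is only a minimal sketch: the sample world-line (one-dimensional motion with rapidity $\phi(\tau)=g\tau+0.2\sin(w\tau)$), the parameter values, and all function names are assumptions made for illustration, and units with $c=1$ are used.

```python
import math

C = 1.0                       # units with c = 1 (illustrative choice)
ETA = (-1.0, 1.0, 1.0, 1.0)   # signature (-,+,+,+), as in the text


def dot(a, b):
    """Minkowski product a_mu b^mu with signature (-,+,+,+)."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))


def sample_worldline(tau, g=0.7, w=1.3):
    """Hypothetical 1D world-line: u = c (cosh phi, sinh phi, 0, 0).

    Returns u, du/dtau and d^2u/dtau^2 for phi = g*tau + 0.2*sin(w*tau).
    """
    phi = g * tau + 0.2 * math.sin(w * tau)
    dphi = g + 0.2 * w * math.cos(w * tau)
    ddphi = -0.2 * w * w * math.sin(w * tau)
    ch, sh = math.cosh(phi), math.sinh(phi)
    u = (C * ch, C * sh, 0.0, 0.0)
    du = (C * sh * dphi, C * ch * dphi, 0.0, 0.0)
    ddu = (C * (ch * dphi**2 + sh * ddphi),
           C * (sh * dphi**2 + ch * ddphi), 0.0, 0.0)
    return u, du, ddu


def bracket_147(u, ddu):
    """Bracket of eq. (12.147): ddu + (u.ddu / c^2) u."""
    k = dot(u, ddu) / C**2
    return tuple(dd + k * x for dd, x in zip(ddu, u))


def bracket_149(u, du, ddu):
    """Bracket of eq. (12.149): ddu - (du.du / c^2) u."""
    k = dot(du, du) / C**2
    return tuple(dd - k * x for dd, x in zip(ddu, u))


u, du, ddu = sample_worldline(0.4)
b147 = bracket_147(u, ddu)
b149 = bracket_149(u, du, ddu)
max_diff = max(abs(x - y) for x, y in zip(b147, b149))
transversality = dot(u, b149)   # should vanish: the bracket is orthogonal to u
```

For this world-line both `max_diff` and `transversality` should vanish up to rounding errors, reflecting the identities $u_\mu\dot u^\mu=0$ and $u_\mu\ddot u^\mu=-\dot u^2$ used above.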
Equivalently, eq. (12.147) can also be written as

$$m_a\dot u_a^\mu=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\left(\eta^{\mu\nu}+\frac{u_a^\mu u_a^\nu}{c^2}\right)\ddot u_{a,\nu}+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,, \tag{12.150}$$

which explicitly displays the tensor structure $(\eta^{\mu\nu}+u_a^\mu u_a^\nu/c^2)$, which is
transverse to $u_{a,\mu}$. Equation (12.150), or any of its equivalent forms, is
called the Abraham–Lorentz–Dirac (ALD) equation. Its non-relativistic
limit, including both corrections of order $1/c^2$ and $1/c^3$, is given by
eq. (12.142). Actually, in the non-relativistic limit, if one is interested in
the energy loss of the particle, one can keep only the leading dissipative
effect, which is the term $1/c^3$, neglecting the correction of order $1/c^2$ to
the conservative dynamics, and write


$$m_a\frac{d\mathbf v_a}{dt}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\frac{d^2\mathbf v_a}{dt^2}+\mathbf F_{\rm ext}\,, \tag{12.151}$$


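which is called the Abraham–Lorentz equation.²⁶ In terms of the four-momentum
$p_a^\mu$, and writing explicitly the dot as $d/d\tau$, eq. (12.149) reads

$$\frac{dp_a^\mu}{d\tau}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_ac^3}\frac{d^2p_a^\mu}{d\tau^2}-\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_a^3c^5}\left(\frac{dp_a^\nu}{d\tau}\frac{dp_{a,\nu}}{d\tau}\right)p_a^\mu+\frac{q_a}{m_a}F^{\mu\nu}_{\rm ext}[x_a(\tau)]\,p_{a,\nu}\,. \tag{12.152}$$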
Comparing with eq. (10.169) we see that the second term,

$$F^\mu_{\rm rad}=-\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_a^3c^5}\left(\frac{dp_a^\nu}{d\tau}\frac{dp_{a,\nu}}{d\tau}\right)p_a^\mu\,, \tag{12.153}$$

is precisely the negative of the four-momentum radiated by a point particle,
to all orders in $v/c$. This term, therefore, is a dissipative term in the
equation of motion, that accounts for the loss of energy to electromagnetic
waves, exactly in $v/c$. $F^\mu_{\rm rad}$ is therefore called the radiation-reaction
“force” (of course, it is a four-vector, but, following common usage, we
will refer to it as a force, rather than as a “four-force”). The first term,


$$F^\mu_{\rm Schott}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_ac^3}\frac{d^2p_a^\mu}{d\tau^2}\,, \tag{12.154}$$

is called the Schott term. The total self-force, due to the particle self-field,
is therefore the sum of the radiation-reaction term and of the Schott term,

$$F^\mu_{\rm self}=F^\mu_{\rm rad}+F^\mu_{\rm Schott}\,. \tag{12.155}$$


We observe that the Schott term is a total time derivative, so eq. (12.152) 
can be rewritten as 


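$$\frac{d}{d\tau}\left(p_a^\mu-\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_ac^3}\frac{dp_a^\mu}{d\tau}\right)=F^\mu_{\rm rad}+\frac{q_a}{m_a}F^{\mu\nu}_{\rm ext}[x_a(\tau)]\,p_{a,\nu}\,. \tag{12.156}$$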


To make contact with the discussion of Section 12.3.3, we consider the 
first non-vanishing contributions in the non-relativistic limit. Equa- 
tions (12.153) and (12.154) give 


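$$F^0_{\rm Schott}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^4}\left(\mathbf v_a\cdot\ddot{\mathbf v}_a+\dot{\mathbf v}_a^2\right)+O\!\left(\frac{1}{c^6}\right), \tag{12.157}$$

$$F^0_{\rm rad}=-\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^4}\,\dot{\mathbf v}_a^2+O\!\left(\frac{1}{c^6}\right), \tag{12.158}$$

and

$$\mathbf F_{\rm Schott}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\,\ddot{\mathbf v}_a+O\!\left(\frac{1}{c^5}\right), \tag{12.159}$$

$$\mathbf F_{\rm rad}=O\!\left(\frac{1}{c^5}\right). \tag{12.160}$$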


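Therefore, to lowest order in $v/c$, the spatial component of eq. (12.152)
becomes

$$\frac{d\mathbf p_a}{dt}=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3c^3}\,\ddot{\mathbf v}_a+\mathbf F_{\rm ext}\,, \tag{12.161}$$

and reproduces the Abraham–Lorentz equation (12.151). Note that, in
this equation, to lowest order, only the Schott term contributes. For the
temporal component, writing $q_a\mathbf v_a=\dot{\mathbf d}_a$, we get

$$\frac{dE_a}{dt}=\frac{1}{4\pi\epsilon_0}\frac{2}{3c^3}\left[\left(\dot{\mathbf d}_a\cdot\dddot{\mathbf d}_a+\ddot{\mathbf d}_a^{\,2}\right)-\ddot{\mathbf d}_a^{\,2}\right]+\mathbf F_{\rm ext}\cdot\mathbf v_a\,. \tag{12.162}$$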
Inside the bracket, the two terms $\ddot{\mathbf d}_a^{\,2}$ cancel, giving back the first line of
eq. (12.136) (with $\dot{\mathbf d}\cdot\dddot{\mathbf d}$ replaced by $\dot{\mathbf d}_a\cdot\dddot{\mathbf d}_a$, and all the terms involving
the other particles reabsorbed in the work $\mathbf F_{\rm ext}\cdot\mathbf v_a$ done by the external
force acting on the a-th particle). However, we see that the two separate
contributions from $F^0_{\rm Schott}$ and $F^0_{\rm rad}$ correspond to the separation
made in the second line of eq. (12.136). In particular, $F^0_{\rm Schott}$ is a total
time derivative, and produces the term that we denoted as $E_{1.5\rm PN}$ in
eq. (12.140).

Until this point, everything appears to work very nicely. Pushing 
the PN expansion up to 1.5PN order, in Section 12.3.3 we have found 
a radiation-reaction force that describes the loss of energy to electro- 
magnetic waves, as predicted by the Larmor formula. In this section, we 
have found a covariantization of this result, that has provided an (appar- 
ently) exact equation, the ALD equation, that describes the self-force to 
all orders in v/c, and we have seen that it includes a radiation-reaction 
force that reproduces the loss of energy given by the exact relativistic 
formula (10.169), plus conservative terms that generalize to all orders 
the 1.5PN contribution to the energy, $E_{1.5\rm PN}$, found in eq. (12.140).
Further examination of the ALD equation, however, reveals an apparent 
pathology. The problem is already present in the non-relativistic limit 
given by the Abraham—Lorentz equation, so let us discuss it first in this 
simpler setting. Introducing the timescale 


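$$\tau_a=\frac{1}{4\pi\epsilon_0}\frac{2q_a^2}{3m_ac^3} \tag{12.163}$$

we can rewrite eq. (12.151) as

$$m_a\left(\frac{d\mathbf v_a}{dt}-\tau_a\frac{d^2\mathbf v_a}{dt^2}\right)=\mathbf F_{\rm ext}\,. \tag{12.164}$$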


This equation is of second order in $\mathbf v_a(t)$, i.e., of third order in $\mathbf x_a(t)$
and therefore, to be solved, requires initial conditions on the position,
velocity and acceleration. This is already in contrast with the normal
situation in classical mechanics. Furthermore, consider this equation
when $\mathbf F_{\rm ext}=0$. Then, eq. (12.164) has the obvious solution $d\mathbf v_a/dt=0$,
as we expect for a free particle not subject to any external force.
However, it is also satisfied if the acceleration $\mathbf a_a=d\mathbf v_a/dt$ satisfies

$$\frac{d\mathbf a_a}{dt}=\frac{1}{\tau_a}\,\mathbf a_a\,, \tag{12.165}$$

and this, besides $\mathbf a_a=0$, also has the exponentially growing solution

$$\mathbf a_a(t)=\mathbf a_a(0)\,e^{t/\tau_a}\,. \tag{12.166}$$


This solution describes a charged particle that accelerates even in the 
absence of an external force, and this acceleration even grows exponen- 
tially in time. Such a “self-accelerating” solution is obviously unphysical. 
One could simply exclude such solutions by hand (or, e.g., imposing the 
boundary condition that the acceleration does not diverge as $t\to\infty$).
However, it is possible to modify directly the Abraham–Lorentz equation
so that this solution simply does not appear. To this purpose, we recall
from eq. (5.32) that, for an electron with charge $q=-e$, the quantity

$$r_0=\frac{1}{4\pi\epsilon_0}\frac{e^2}{m_ec^2} \tag{12.167}$$


is called the classical electron radius. Numerically, its value is $r_0\simeq
2.8\times10^{-15}$ m, and therefore it represents a scale typical of the realm
of elementary particle physics. The timescale $\tau_a$ introduced above, for
an electron, becomes $\tau=(2/3)(r_0/c)$. It is therefore of the order of the
time that light takes to travel across such a small length-scale and,
numerically, is of order $6\times10^{-24}$ s. The second term in parenthesis in
eq. (12.164) is comparable to the first only if the typical time-scale over
which the velocity changes in time is of order $\tau_a$, otherwise it is much
smaller. Such fast variations, implying relativistic speeds over subatomic
distances, necessarily belong to the domain of relativistic quantum mechanics
and quantum field theory. In all situations where a classical
treatment is justified, the term $\tau_a\ddot{\mathbf v}_a$ in eq. (12.164) is much smaller
than $\dot{\mathbf v}_a$. We can then use a perturbative approach, analogous to the
reduction of order discussed in Section 12.2.3: to zero-th order we just
neglect the radiation reaction term, writing simply $m_a\dot{\mathbf v}_a=\mathbf F_{\rm ext}$. We
then use this equation to compute $\ddot{\mathbf v}_a$, so that $m_a\ddot{\mathbf v}_a=\dot{\mathbf F}_{\rm ext}$, and we
plug it into eq. (12.164). This gives


$$m_a\frac{d\mathbf v_a}{dt}=\mathbf F_{\rm ext}+\tau_a\frac{d\mathbf F_{\rm ext}}{dt}\,. \tag{12.168}$$

Now, if $\mathbf F_{\rm ext}=0$, the only solution is $d\mathbf v_a/dt=0$. In the range of validity
of classical electrodynamics, eq. (12.168) is equivalent to eq. (12.164).
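The orders of magnitude quoted above, and the difference between the runaway solution (12.166) and the reduced-order equation (12.168), can be made concrete with a short numerical sketch. The values of the physical constants are standard SI values typed in by hand, and the function names are hypothetical; nothing here is taken from the text beyond eqs. (12.163) and (12.166)–(12.168).

```python
import math

# Physical constants in SI units (standard values, entered here for illustration)
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
C = 2.99792458e8             # speed of light, m/s
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m


def classical_radius(q, m):
    """r_0 = q^2 / (4 pi eps0 m c^2), eq. (12.167) for q = -e, m = m_e."""
    return q * q / (4.0 * math.pi * EPS0 * m * C * C)


def tau_a(q, m):
    """Timescale of eq. (12.163): tau = (2/3) r_0 / c."""
    return (2.0 / 3.0) * classical_radius(q, m) / C


def runaway_acceleration(a0, t, tau):
    """Self-accelerating solution (12.166) of the force-free eq. (12.164)."""
    return a0 * math.exp(t / tau)


def reduced_order_acceleration(force, dforce_dt, m, tau):
    """Acceleration from the reduced-order eq. (12.168), one component."""
    return (force + tau * dforce_dt) / m


r0 = classical_radius(E_CHARGE, M_E)   # about 2.8e-15 m
tau_e = tau_a(E_CHARGE, M_E)           # about 6e-24 s

# With no external force the reduced-order equation gives zero acceleration,
# while the runaway solution has grown by a factor e^10 after ten tau.
growth_after_10_tau = runaway_acceleration(1.0, 10.0 * tau_e, tau_e)
free_acceleration = reduced_order_acceleration(0.0, 0.0, M_E, tau_e)
```

The point of the comparison is only qualitative: eq. (12.168) is first order in the velocity, so a vanishing force leaves no room for the exponentially growing mode.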

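The same problem appears in the full relativistic ALD equation, and
the same solution applies. To lowest order eq. (12.150) gives

$$m_a\dot u_a^\mu=q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}\,. \tag{12.169}$$

Using this to compute $\ddot u_a^\mu$, we get

$$\ddot u_a^\mu=\frac{q_a}{m_a}\left(\partial_\rho F^{\mu\nu}_{\rm ext}[x_a(\tau)]\right)u_a^\rho u_{a,\nu}+\frac{q_a^2}{m_a^2}\,F^{\mu\nu}_{\rm ext}[x_a(\tau)]\,F_{{\rm ext},\nu\rho}[x_a(\tau)]\,u_a^\rho\,. \tag{12.170}$$

Inserting this into eq. (12.150) and using the definition (12.163), we get

$$m_a\dot u_a^\mu=q_aF^{\mu\nu}_{\rm ext}u_{a,\nu}+\tau_a\left[q_a\left(\partial_\rho F^{\mu\nu}_{\rm ext}\right)u_a^\rho u_{a,\nu}+\frac{q_a^2}{m_a}F^{\mu\nu}_{\rm ext}F_{{\rm ext},\nu\rho}u_a^\rho-\frac{q_a^2}{m_ac^2}\left(F^{\nu\rho}_{\rm ext}u_{a,\rho}\right)\left(F_{{\rm ext},\nu\sigma}u_a^\sigma\right)u_a^\mu\right], \tag{12.171}$$

where it is understood that $F^{\mu\nu}_{\rm ext}$ and $\partial_\rho F^{\mu\nu}_{\rm ext}$ must be computed on the
particle world-line. This is called the Landau–Lifshitz form of the ALD
equation.²⁷ It is the covariant generalization of eq. (12.168), to all orders
in $v/c$, and correctly reproduces the fact that $\dot u_a^\mu=0$ when $F^{\mu\nu}_{\rm ext}=0$.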


27 See Landau and Lifschits (1975), Sec- 
tion 76. 
12.3.5 Covariant derivation of mass
renormalization and radiation reaction


In the previous sections we have seen that, at 1PN order, the self-field 
of a particle produces a divergent contribution, that must be regularized 
and is eventually reabsorbed into a mass renormalization while, at 1.5PN
order, a radiation-reaction force appears, to which the particle self-field
gives a finite contribution (that, indeed, we could compute without the
need of introducing a regularization). We have then seen how to compute
radiation reaction exactly, to all orders in v/c, by covariantizing
the lowest-order result. In the computation of mass renormalization, we 
used a regularization scheme that breaks Lorentz invariance (and that 
mimics an extended classical electron model), but we recovered Lorentz 
invariance at the level of renormalized quantities, by introducing a coun- 
terterm that is not Lorentz invariant. In this section we show that it is 
possible to deal with the mass renormalization with a regularization that 
respects Lorentz invariance and, by the same computation, to obtain the 
ALD equation from a direct evaluation of the self-field on the particle 
world-line, without appealing to a covariantization of the lowest-order 
result. 

We start from the Lorentz-covariant equation of motion (8.62) for a
point particle of charge $q_a$, whose world-line is given by $x_a^\mu(\tau)$, and we
rewrite it separating explicitly the contribution of the external field and
of the self-field of the particle,

$$m_{a,0}(\ell)\,\frac{du_a^\mu(\tau)}{d\tau}=q_aF^{\mu\nu}_{a,\rm self}[x_a(\tau)]\,u_{a,\nu}(\tau)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}(\tau)\,. \tag{12.172}$$

We have also anticipated that the mass that appears on the left-hand
side of this equation is still a bare mass, with $\ell$ the corresponding cutoff,
and we have stressed that $F^{\mu\nu}_{a,\rm self}$ and $F^{\mu\nu}_{\rm ext}$ must be computed on the
particle world-line. We then focus on the contribution of the self-field,
which is given by

$$F^{\mu\nu}_{a,\rm self}=\partial^\mu A^\nu_{a,\rm self}-\partial^\nu A^\mu_{a,\rm self}\,, \tag{12.173}$$
where $A^\mu_{a,\rm self}$ is the gauge field generated by the a-th particle itself. From
eq. (10.10),

$$A^\mu_{a,\rm self}(x)=-\frac{1}{\epsilon_0c^2}\int d^4x'\,G_{\rm ret}(x-x')\,j_a^\mu(x')\,, \tag{12.174}$$

where $G_{\rm ret}(x-x')$ is the retarded Green’s function, that we write in the
explicitly Lorentz-invariant form (10.41)

$$G_{\rm ret}(x-x')=-\frac{1}{2\pi}\,\theta(x^0-x'^0)\,\delta[(x-x')^2]\,. \tag{12.175}$$

The four-current $j_a^\mu(x)$ associated with a point particle was given in
eq. (8.3), that we rewrite here as

$$j_a^\mu(x)=q_a\int_{-\infty}^{\infty}c\,d\tau'\,u_a^\mu(\tau')\,\delta^{(4)}[x-x_a(\tau')]\,. \tag{12.176}$$
Inserting eq. (12.176) in eq. (12.174), we get

$$A^\mu_{a,\rm self}(x)=-\frac{q_a}{\epsilon_0c}\int_{-\infty}^{\infty}d\tau'\,u_a^\mu(\tau')\int d^4x'\,G_{\rm ret}(x-x')\,\delta^{(4)}[x'-x_a(\tau')]\,. \tag{12.177}$$

If we now carry out the integral over $d^4x'$ using $\delta^{(4)}[x'-x_a(\tau')]$ we get

$$A^\mu_{a,\rm self}(x)=-\frac{q_a}{\epsilon_0c}\int_{-\infty}^{\infty}d\tau'\,G_{\rm ret}[x-x_a(\tau')]\,u_a^\mu(\tau')\,, \tag{12.178}$$

and (as we will see explicitly in the following computation), if we finally
evaluate this expression in $x^\mu=x_a^\mu(\tau)$, we find a divergence in
$A^\mu_{a,\rm self}[x_a(\tau)]$, or in the corresponding result for $F^{\mu\nu}_{a,\rm self}[x_a(\tau)]$, when the
integration variable $\tau'$ is equal to $\tau$, due to the Dirac delta on the past
light cone that appears in eq. (12.175). From this point of view, the
origin of the divergence that we found in Section 12.3.2 is therefore the
“collision” between the term $\delta^{(4)}[x-x_a(\tau')]$ in the current (12.176),
and the term $\delta[(x-x')^2]$ in the Green’s function. We therefore need
to regularize eq. (12.177). The computation that we have described in
Section 12.3.2 was based on the idea of regularizing the current $j_a^\mu(x)$
given in eq. (12.176), replacing $\delta^{(4)}[x'-x_a(\tau')]$ with a smoothed version
of the Dirac delta, which, in practice, we did by expanding it in Fourier
modes and setting to zero the modes with |k| > /é. This regularization 
corresponds to the intuitive idea of an extended classical electron model,
although we have repeatedly stressed that it must only be considered as
a regularization scheme, and the limit $\ell\to0$ must eventually be taken,
canceling the divergence against an $\ell$-dependence of the bare parameters.
As we have discussed, this regularization is not Lorentz invariant,
but Lorentz invariance is recovered for the renormalized quantities by
introducing two different bare masses for the temporal and spatial components
of $p^\mu$ or, equivalently, by adding to the bare action a counterterm
which is not Lorentz invariant (and, in fact, is simply proportional to
the non-relativistic action of a free particle).

An alternative procedure, that we will follow in this section, consists 
in regularizing the retarded Green’s function in eq. (12.175), rather than 
the current $j^\mu(x)$. As we will see, this can be done preserving explicitly
Lorentz invariance for the divergent part, so that a single bare mass 
term will be sufficient to reabsorb the divergences in the spatial and in 
the temporal components of the equation of motion. 

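Using eq. (12.178) for $A^\mu_{a,\rm self}(x)$, with the retarded Green’s function
suitably regularized, we write

$$\begin{aligned}
F^{\mu\nu}_{a,\rm self}(x)&=-\frac{q_a}{\epsilon_0c}\int_{-\infty}^{\infty}d\tau'\left\{\partial^\mu G_{\rm ret}[x-x_a(\tau')]\,u_a^\nu(\tau')-\partial^\nu G_{\rm ret}[x-x_a(\tau')]\,u_a^\mu(\tau')\right\}\\
&=\frac{q_a}{\epsilon_0c}\int_{-\infty}^{\infty}d\tau'\left[u_a^\mu(\tau')\,\partial^\nu-u_a^\nu(\tau')\,\partial^\mu\right]G_{\rm ret}[x-x_a(\tau')]\\
&=\frac{q_a}{\epsilon_0c}\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)\int_{-\infty}^{\infty}d\tau'\,u_{a,\alpha}(\tau')\,\partial_\beta G_{\rm ret}[x-x_a(\tau')]\,.
\end{aligned} \tag{12.179}$$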
Then, evaluating $F^{\mu\nu}_{a,\rm self}(x)$ on the particle world-line,

$$F^{\mu\nu}_{a,\rm self}[x_a(\tau)]=\frac{q_a}{\epsilon_0c}\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)\int_{-\infty}^{\infty}d\tau'\,u_{a,\alpha}(\tau')\left(\partial_\beta G_{\rm ret}[x-x_a(\tau')]\right)_{x=x_a(\tau)}. \tag{12.180}$$

After evaluating this expression explicitly, with a given regularization,
we will insert it in eq. (12.172) and, as we will see, we will obtain a term
which diverges when we remove the regularization, which is reabsorbed
in a mass renormalization, and a finite contribution that gives the self-force
term of the ALD equation (12.149).

To regularize $G_{\rm ret}$, it is convenient to introduce first the symmetric
and antisymmetric combinations of the retarded and advanced Green’s
functions,

$$G_S(x)=\frac12\left[G_{\rm ret}(x)+G_{\rm adv}(x)\right], \tag{12.181}$$

$$G_A(x)=\frac12\left[G_{\rm ret}(x)-G_{\rm adv}(x)\right], \tag{12.182}$$

so

$$G_{\rm ret}(x)=G_S(x)+G_A(x)\,. \tag{12.183}$$

Using the covariant expressions (10.40) for the retarded and advanced
Green’s functions, we see that

$$G_S(x)=-\frac{1}{4\pi}\,\delta(x^2)\,. \tag{12.184}$$

Since this quantity depends only on the Lorentz-invariant variable $x^2$, it
is easy to regularize it in a Lorentz-invariant manner, simply replacing
the Dirac delta by any of its regularizations discussed in Section 1.4, so
we replace eq. (12.184) by

$$G_S(x)=-\frac{1}{4\pi}\,\delta_{\rm reg}(x^2)\,. \tag{12.185}$$
We will use, for definiteness, the gaussian regularization (1.55), that we
rewrite here in the notation

$$\delta_{\rm reg}(z)=\frac{1}{\sqrt{2\pi}\,a}\,e^{-z^2/(2a^2)}\,, \tag{12.186}$$

with $a$ a regularization parameter to be eventually sent to zero.

For the antisymmetric combination, a proper treatment involves some
subtlety. Using eq. (10.40),

$$G_A(x)=-\frac{1}{4\pi}\,\epsilon(x^0)\,\delta(x^2)\,, \tag{12.187}$$

where $\epsilon(x^0)=\theta(x^0)-\theta(-x^0)$. However, the product of a distribution
such as $\theta(x^0)$, or $\epsilon(x^0)$, and a distribution such as $\delta(x^2)=\delta[-(x^0)^2+|\mathbf x|^2]$,
is ill defined on the tip of the light cone, where $|\mathbf x|=0$. As will be clear
from the following computation, this is exactly the region from where
all the result for the self-field comes, and a naive computation using
eq. (12.187), even with $\delta(x^2)$ replaced by $\delta_{\rm reg}(x^2)$, would eventually lead
to ill-defined expressions (as we will see in Note 30 on page 341 and in
Note 33 on page 343). In order to deal with this expression in a clean
way, we then proceed as follows. We begin by recalling the identity,
valid in the sense of distributions,

$$\lim_{\epsilon\to0^+}\frac{1}{x\pm i\epsilon}=P\,\frac1x\mp i\pi\,\delta(x)\,, \tag{12.188}$$


where the symbol P denotes the principal part. We now introduce a
time-like four-vector $\eta^\mu=\epsilon(1,0,0,0)$, with $\epsilon\to0^+$, and we consider the
combination $x^\mu-i\eta^\mu=(x^0-i\epsilon,\mathbf x)$. Then, to $O(\epsilon)$,

$$(x-i\eta)^2=-(x^0-i\epsilon)^2+\mathbf x^2\simeq x^2+2i\epsilon x^0\,. \tag{12.189}$$


Since $\epsilon>0$, $2\epsilon x^0$ is an infinitesimal quantity whose sign is the same as
that of $x^0$. Therefore, from eq. (12.188),

$$\lim_{\eta\to0^+}\frac{1}{(x-i\eta)^2}=P\,\frac{1}{x^2}-i\pi\,\epsilon(x^0)\,\delta(x^2)\,, \tag{12.190}$$


where the limit $\eta\to0^+$ is defined by sending $\epsilon\to0^+$ in $\eta^\mu=\epsilon(1,0,0,0)$,
and $\epsilon(x^0)$ (not to be confused with the infinitesimal constant $\epsilon$; both
notations are standard and we do not change either of them) is the sign of
$x^0$, i.e., $\epsilon(x^0)=\theta(x^0)-\theta(-x^0)$. Similarly, to $O(\epsilon)$, $(x+i\eta)^2=x^2-2i\epsilon x^0$,
and therefore

$$\lim_{\eta\to0^+}\frac{1}{(x+i\eta)^2}=P\,\frac{1}{x^2}+i\pi\,\epsilon(x^0)\,\delta(x^2)\,. \tag{12.191}$$


Taking the difference of eqs. (12.191) and (12.190), the principal part
cancels and we get

$$\lim_{\eta\to0^+}\left[\frac{1}{(x+i\eta)^2}-\frac{1}{(x-i\eta)^2}\right]=2i\pi\,\epsilon(x^0)\,\delta(x^2)\,. \tag{12.192}$$


Therefore, the combination $\epsilon(x^0)\delta(x^2)$ has the representation

$$\epsilon(x^0)\,\delta(x^2)=-\frac{i}{2\pi}\lim_{\eta\to0^+}\left[\frac{1}{(x+i\eta)^2}-\frac{1}{(x-i\eta)^2}\right]. \tag{12.193}$$


We will then use this expression, with finite $\eta$, as a regularization of the
whole combination $\epsilon(x^0)\delta(x^2)$, defining

$$\left[\epsilon(x^0)\,\delta(x^2)\right]_{\rm reg}=-\frac{i}{2\pi}\left[\frac{1}{(x+i\eta)^2}-\frac{1}{(x-i\eta)^2}\right], \tag{12.194}$$

and we will then take the limit $\eta\to0^+$ only at the end of the computation.
Note that this regularization formally breaks Lorentz invariance,
since, if the four-vector $\eta^\mu$ has the form $\eta^\mu=\epsilon(1,0,0,0)$ in a reference



28 Observe that we are using two different
regulators for $G_S$ and for $G_A$.
This is justified because the computations
of the contributions from $G_S$ and
from $G_A$ are completely independent,
and the former will turn out to be divergent,
while the latter will be finite
when we send the regulator to zero.


frame, it will have a different form in a boosted frame. Therefore, the 
previous computation implicitly assumes a given reference frame. However,
in the limit $\epsilon\to0^+$, we expect Lorentz invariance to be recovered.
We will check this point in the following. 

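From eq. (12.187), the regularized Green’s function $G_A(x)$ can then
be written as

$$G_A(x)=-\frac{1}{4\pi}\left[\epsilon(x^0)\,\delta(x^2)\right]_{\rm reg}=\frac{i}{8\pi^2}\left[\frac{1}{(x+i\eta)^2}-\frac{1}{(x-i\eta)^2}\right]. \tag{12.195}$$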


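We are now ready to carry out the computation.²⁸ We consider first the
contribution of $G_S$ to $F^{\mu\nu}_{a,\rm self}[x_a(\tau)]$, defining

$$F^{\mu\nu}_S[x_a(\tau)]=\frac{q_a}{\epsilon_0c}\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)\int_{-\infty}^{\infty}d\tau'\,u_{a,\alpha}(\tau')\left(\partial_\beta G_S[x-x_a(\tau')]\right)_{x=x_a(\tau)}, \tag{12.196}$$

where $G_S$ is the regularized Green’s function (12.185). From eq. (12.185),
$G_S(x-x')$ is actually a function only of the combination $w=(x-x')^2$,
so

$$\partial_\beta G_S(x-x')=\left[\partial_\beta(x-x')^2\right]\frac{d}{dw}G_S(w)=-\frac{1}{2\pi}\,(x-x')_\beta\,\frac{d\delta_{\rm reg}(w)}{dw}\,, \tag{12.197}$$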


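and therefore

$$\left(\partial_\beta G_S[x-x_a(\tau')]\right)_{x=x_a(\tau)}=-\frac{1}{2\pi}\,[x_a(\tau)-x_a(\tau')]_\beta\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=w_a(\tau,\tau')}, \tag{12.198}$$

where $w_a(\tau,\tau')=[x_a(\tau)-x_a(\tau')]^2$. Then,

$$F^{\mu\nu}_S[x_a(\tau)]=\frac{q_a}{2\pi\epsilon_0c}\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)\int_{-\infty}^{\infty}d\tau'\,u_{a,\alpha}(\tau')\,[x_a(\tau')-x_a(\tau)]_\beta\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=w_a(\tau,\tau')}. \tag{12.199}$$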


We now expand the integrand around $\tau'=\tau$. Even if the integration is
over $\tau'\in[-\infty,\infty]$, we will see in a moment that, when we remove the
regularization and $\delta_{\rm reg}(w)\to\delta(w)$, only a finite number of terms in the
expansion (in fact, only one term) gives a non-vanishing contribution, so
the expansion will provide us with the full answer. Writing $\tau'=\tau+\sigma$,
we have

$$u_{a,\alpha}(\tau')=u_{a,\alpha}(\tau)+\sigma\,\dot u_{a,\alpha}(\tau)+\frac12\sigma^2\,\ddot u_{a,\alpha}(\tau)+O(\sigma^3)\,, \tag{12.200}$$
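$$x_{a,\beta}(\tau+\sigma)-x_{a,\beta}(\tau)=\sigma\,u_{a,\beta}(\tau)+\frac12\sigma^2\,\dot u_{a,\beta}(\tau)+\frac16\sigma^3\,\ddot u_{a,\beta}(\tau)+O(\sigma^4)\,. \tag{12.201}$$

Then,

$$\begin{aligned}
&\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)u_{a,\alpha}(\tau')\,[x_a(\tau')-x_a(\tau)]_\beta\\
&\qquad=\frac12\sigma^2\left[u_a^\nu(\tau)\dot u_a^\mu(\tau)-u_a^\mu(\tau)\dot u_a^\nu(\tau)\right]+\frac13\sigma^3\left[u_a^\nu(\tau)\ddot u_a^\mu(\tau)-u_a^\mu(\tau)\ddot u_a^\nu(\tau)\right]+O(\sigma^4)\,,
\end{aligned} \tag{12.202}$$

where we used $u_{a,\mu}u_a^\mu=-c^2$ and $u_\mu\dot u^\mu=0$ (which follows applying $d/d\tau$
to $u_\mu u^\mu=-c^2$). Note that the term linear in $\sigma$ canceled because it is
proportional to $u_{a,\alpha}(\tau)u_{a,\beta}(\tau)$ which, upon contraction with
$(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta})$, gives zero.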

To expand the derivative of the Dirac delta in eq. (12.199) we observe,
from eq. (12.201), that

$$w_a(\tau,\tau')=-c^2\sigma^2+O(\sigma^4)\,. \tag{12.203}$$

Then, apart from terms that will give a vanishing contribution when we
remove the regularization,²⁹

$$\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=w_a(\tau,\tau')}=\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=-c^2\sigma^2}+\ldots \tag{12.204}$$

We now introduce $z=-w/c^2$ and we use the properties $\delta(-z)=\delta(z)$
and $\delta(c^2z)=(1/c^2)\delta(z)$ that, apart at most for terms that vanish as
we remove the regularization, also hold for the regularized Dirac delta
[independently of the specific choice (12.186)]. We then get

$$\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=w_a(\tau,\tau')}=-\frac{1}{c^4}\left(\frac{d\delta_{\rm reg}(z)}{dz}\right)_{z=\sigma^2}+\ldots \tag{12.205}$$

Since $\tau$ is fixed, in eq. (12.199) the integral in $d\tau'$ is the same as an integral
in $d\sigma$. Then, inserting eqs. (12.202) and (12.205) into eq. (12.199),
we are left with the evaluation of integrals of the form

$$A_n=\int_{-\infty}^{\infty}d\sigma\,\sigma^n\left(\frac{d\delta_{\rm reg}(z)}{dz}\right)_{z=\sigma^2}, \tag{12.206}$$

for $n\geq2$. For n odd, these integrals vanish by parity since $d\delta_{\rm reg}(z)/dz$ is
evaluated in $z=\sigma^2$, so at a point that does not change under $\sigma\to-\sigma$.
For n even, we write

$$A_n=2\int_0^{\infty}d\sigma\,\sigma^n\left(\frac{d\delta_{\rm reg}(z)}{dz}\right)_{z=\sigma^2}, \tag{12.207}$$

since the integrand is even in $\sigma$. It is now necessary to carry out the integral
using an explicit regularization of the Dirac delta and remove the
regularization only afterwards. If, instead, we removed the regularization
inside the integral, we would end up with a derivative of the Dirac

29 The vanishing of these terms is a
consequence of the property (1.60) of
the Dirac delta, so that, for instance
$\delta(x+x^2)=\delta(x)/|1+x|$, which is
the same as $\delta(x)$. For the regularized
Dirac delta, we can write $\delta_{\rm reg}(x+x^2)=
\delta_{\rm reg}(x)/[1+O(x)]$. One can then check
the effect of these corrections on the final
result, writing

$$\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=-c^2\sigma^2+O(\sigma^4)}\simeq\frac{1}{1+O(\sigma^2)}\left(\frac{d\delta_{\rm reg}(w)}{dw}\right)_{w=-c^2\sigma^2}.$$

Then, repeating the computation that
we will carry out below, in eq. (12.208),
one would find that $\sigma^{n+2}$ is replaced by
$\sigma^{n+2}/[1+O(\sigma^2)]$, and, in the second
line of eq. (12.208), we would get

$$a^{(n-3)/2}\int_0^{\infty}du\,\frac{u^{(n+1)/2}}{1+c_1au}\,e^{-u^2/2}$$

for some constant $c_1$. We would then
confirm that, in the limit $a\to0$, all
terms with $n\geq4$ are vanishing, and
the term n = 2 still gives the same
divergent contribution proportional to
$a^{-1/2}$, and even the same finite part,
since $a^{-1/2}[1+O(a)]$ goes to $a^{-1/2}$.
Analogous considerations hold for the
manipulations leading to eq. (12.205).
delta evaluated at the boundary of the integration domain, which is an
ill-defined quantity. Using, for definiteness, the regularization (12.186),

$$\begin{aligned}
A_n&=-\frac{2}{\sqrt{2\pi}\,a^3}\int_0^{\infty}d\sigma\,\sigma^{n+2}\,e^{-\sigma^4/(2a^2)}\\
&=-\frac{1}{\sqrt{2\pi}}\,a^{(n-3)/2}\int_0^{\infty}du\,u^{(n+1)/2}\,e^{-u^2/2}\,,
\end{aligned} \tag{12.208}$$

where, in the second line, we introduced $u=\sigma^2/a$. We now observe
that, among the values of n even and such that $n\geq2$, in the limit
$a\to0$, $A_n$ diverges for n = 2, while it vanishes for all $n\geq4$. We also
note that the regularization parameter $a$ has the same dimensions as
$z=-(x-x')^2/c^2$, so it is a length squared over $c^2$. To make contact
with our previous notation, we then write it as $a=\ell^2/c^2$, where $\ell$ is
a regularization parameter with dimensions of length, so removing the
regularization corresponds to sending $\ell\to0$. Then, from eq. (12.208),
$A_2=a_2c/\ell$, where $a_2$ is a dimensionless constant (whose precise value,
in this case, is $a_2=-2^{1/4}\Gamma(5/4)/\sqrt{2\pi}$, but has no special meaning since
it depends on the regularization used).
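The scaling $A_2\propto a^{-1/2}$ and the value of the constant $a_2$ can be cross-checked by integrating the first line of eq. (12.208) numerically. The sketch below uses a plain composite Simpson rule; the function names, the cutoff of the integration range, and the number of sub-intervals are arbitrary choices made for illustration.

```python
import math


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def A_n(n, a):
    """First line of eq. (12.208) for the gaussian regularization (12.186).

    A_n = -2 / (sqrt(2 pi) a^3) * int_0^inf dsigma sigma^(n+2) exp(-sigma^4 / (2 a^2)).
    The upper limit 8*sqrt(a) is an assumed cutoff: beyond it the integrand
    is numerically zero.
    """
    prefactor = -2.0 / (math.sqrt(2.0 * math.pi) * a**3)
    upper = 8.0 * math.sqrt(a)
    integrand = lambda s: s**(n + 2) * math.exp(-s**4 / (2.0 * a * a))
    return prefactor * simpson(integrand, 0.0, upper)


# Closed form quoted in the text: a_2 = -2^(1/4) Gamma(5/4) / sqrt(2 pi)
A2_COEFF = -2.0**0.25 * math.gamma(1.25) / math.sqrt(2.0 * math.pi)

a = 1.0e-2
a2_numeric = A_n(2, a) * math.sqrt(a)   # should reproduce A2_COEFF
A4_large_a = A_n(4, 1.0e-2)             # A_4 scales as a^(1/2) ...
A4_small_a = A_n(4, 1.0e-4)             # ... so it shrinks as a -> 0
```

Numerically $a_2\approx-0.43$; the product `A_n(2, a) * sqrt(a)` is independent of $a$, while `A_n(4, a)` decreases in magnitude as $a$ is reduced, in line with the statement that only $n=2$ survives.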

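Therefore, since only n = 2 contributes, the result for $F^{\mu\nu}_S[x_a(\tau)]$, to
all orders in the expansion in $\sigma$, is

$$F^{\mu\nu}_S[x_a(\tau)]=\frac{q_a}{4\pi\epsilon_0c^4}\,\frac{a_2}{\ell}\left[u_a^\mu(\tau)\dot u_a^\nu(\tau)-u_a^\nu(\tau)\dot u_a^\mu(\tau)\right], \tag{12.209}$$

apart from terms that vanish as $\ell\to0$. We now insert this into the
equation of motion (12.172), and use the identities $u_a^\nu(\tau)u_{a,\nu}(\tau)=-c^2$
and $\dot u_a^\nu(\tau)u_{a,\nu}(\tau)=0$. Then, we get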


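$$m_{a,0}(\ell)\,\frac{du_a^\mu(\tau)}{d\tau}=\frac{q_a^2}{4\pi\epsilon_0c^2}\,\frac{a_2}{\ell}\,\frac{du_a^\mu}{d\tau}+q_aF^{\mu\nu}_A[x_a(\tau)]\,u_{a,\nu}(\tau)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}(\tau)\,. \tag{12.210}$$

We can rewrite this as

$$\left[m_{a,0}(\ell)-\frac{q_a^2}{4\pi\epsilon_0c^2}\,\frac{a_2}{\ell}\right]\frac{du_a^\mu}{d\tau}=q_aF^{\mu\nu}_A[x_a(\tau)]\,u_{a,\nu}(\tau)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}(\tau)\,, \tag{12.211}$$

and we see that the divergence is reabsorbed in a renormalization of the
mass,

$$m_a=m_{a,0}(\ell)-\frac{q_a^2}{4\pi\epsilon_0c^2}\,\frac{a_2}{\ell}\,, \tag{12.212}$$

so that we simply have

$$\frac{dp_a^\mu}{d\tau}=q_aF^{\mu\nu}_A[x_a(\tau)]\,u_{a,\nu}(\tau)+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}(\tau)\,, \tag{12.213}$$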


with $p_a^\mu=m_au_a^\mu$. Note that, having preserved Lorentz invariance in the
regularization, the spatial and temporal components of $p_a^\mu$ renormalize
in the same way, and the divergence is reabsorbed into a single bare
mass term.

We next turn our attention to the contribution from $F^{\mu\nu}_A$, which depends
on the anti-symmetric combination (12.187) of the retarded and
advanced Green’s functions. We define


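$$F^{\mu\nu}_A[x_a(\tau)]=\frac{q_a}{\epsilon_0c}\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)\int_{-\infty}^{\infty}d\tau'\,u_{a,\alpha}(\tau')\left(\partial_\beta G_A[x-x_a(\tau')]\right)_{x=x_a(\tau)}, \tag{12.214}$$

where $G_A$ is the regularized Green’s function (12.195). Then

$$\partial_\beta G_A(x)=-\frac{i}{4\pi^2}\left[\frac{(x+i\eta)_\beta}{[(x+i\eta)^2]^2}-\frac{(x-i\eta)_\beta}{[(x-i\eta)^2]^2}\right]. \tag{12.215}$$

We now observe that, as in eq. (12.189), to first order in $\epsilon$

$$[(x-i\eta)^2]^2=(x^2+2i\epsilon x^0)^2=x^2(x^2+4i\epsilon x^0)=x^2(x-2i\eta)^2\,, \tag{12.216}$$

and similarly for $[(x+i\eta)^2]^2$. Then, apart from terms $O(\eta)$ that will
give no contribution when we eventually take the limit $\eta\to0^+$
(and redefining $\eta\to\eta/2$, given that anyhow it is a parameter used only
to take the limit $\eta\to0^+$),

$$\partial_\beta G_A(x)=-\frac{i}{4\pi^2}\,\frac{x_\beta}{x^2}\left[\frac{1}{(x+i\eta)^2}-\frac{1}{(x-i\eta)^2}\right]. \tag{12.217}$$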


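Comparing with eq. (12.194), we see that

$$\partial_\beta G_A(x)=\frac{1}{2\pi}\,\frac{x_\beta}{x^2}\left[\epsilon(x^0)\,\delta(x^2)\right]_{\rm reg}, \tag{12.218}$$

so that, once the regularization is removed,³⁰

$$\partial_\beta G_A(x)=\frac{1}{2\pi}\,\frac{x_\beta}{x^2}\,\epsilon(x^0)\,\delta(x^2)\,. \tag{12.221}$$

However, we still need to keep $\partial_\beta G_A$ in the regularized form (12.217),
in order to properly carry out the integral over $\tau'$ in eq. (12.214). In
eq. (12.214) we need $\partial_\beta G_A(x)$ evaluated in $x=x_a(\tau)-x_a(\tau')$, that we
prefer to write as $-[x_a(\tau')-x_a(\tau)]$. Equation (12.214) then becomes

$$\begin{aligned}
F^{\mu\nu}_A[x_a(\tau)]&=\frac{iq_a}{4\pi^2\epsilon_0c}\\
&\quad\times\int_{-\infty}^{\infty}d\tau'\left(\eta^{\mu\alpha}\eta^{\nu\beta}-\eta^{\nu\alpha}\eta^{\mu\beta}\right)u_{a,\alpha}(\tau')\,[x_a(\tau')-x_a(\tau)]_\beta\\
&\quad\times\frac{1}{[x_a(\tau')-x_a(\tau)]^2}\left[\frac{1}{[x_a(\tau')-x_a(\tau)-i\eta]^2}-\frac{1}{[x_a(\tau')-x_a(\tau)+i\eta]^2}\right].
\end{aligned} \tag{12.222}$$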


30 Observe that, if one removes the
regularization already in eq. (12.195),
writing formally

$$G_A(x)=-\frac{1}{4\pi}\,\epsilon(x^0)\,\delta(x^2)\,,$$

and attempts to compute $\partial_\beta G_A$ from
this expression, one is confronted with
ill-defined expressions, and it is not possible
to get eq. (12.221) in any clean
way. A regularization of the full combination
$\epsilon(x^0)\delta(x^2)$ is necessary to perform
a proper computation.

As a check of eq. (12.218) observe, from
eq. (12.195), that it can be rewritten as

$$\partial_\beta G_A=-\frac{2x_\beta}{x^2}\,G_A\,. \tag{12.219}$$

Then, using $\partial^\beta x_\beta=4$,

$$\Box G_A=\partial^\beta\partial_\beta G_A=-\partial^\beta\left(\frac{2x_\beta}{x^2}\,G_A\right)=-\frac{8G_A}{x^2}+\frac{4G_A}{x^2}+\frac{4G_A}{x^2}=0\,. \tag{12.220}$$

Therefore $\Box G_A=0$, as it should, since
$G_A$ is the difference of two Green’s
functions. Note that the above manipulations
become ill-defined at $x^2=0$.
However, the validity of $\Box G_A=0$ for
all values of $x^\mu$, including when $x^2=0$,
can be proven using the regularized expression
(12.215), and sending $\eta\to0^+$
only at the end.
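We now write $\tau'=\tau+\sigma$ and we expand in powers of $\sigma$. The expansion
of the term in the second line is given by eq. (12.202). To expand the
term in the third line, we begin by observing that, working up to $O(\sigma^2)$,

$$[x_a(\tau')-x_a(\tau)-i\eta]^2=(\sigma u_a(\tau)-i\eta)^2=-c^2\left(\sigma^2-2i\epsilon\sigma u_a^0/c^2\right). \tag{12.223}$$

For a physical trajectory $u^0=dx^0/d\tau>0$, and therefore, defining
$\epsilon'=2\epsilon u_a^0/c^2$ (so that $\epsilon\to0^+$ corresponds to $\epsilon'\to0^+$), and finally
renaming $\epsilon'$ as $\epsilon$,

$$\frac{1}{[x_a(\tau')-x_a(\tau)-i\eta]^2}-\frac{1}{[x_a(\tau')-x_a(\tau)+i\eta]^2}=-\frac{2\pi i}{c^2}\left[\epsilon(\sigma)\,\delta(\sigma^2)\right]_{\rm reg}, \tag{12.224}$$

where, with the same steps leading to eq. (12.194),

$$\left[\epsilon(\sigma)\,\delta(\sigma^2)\right]_{\rm reg}=\frac{1}{2\pi i}\left[\frac{1}{\sigma^2-i\epsilon\sigma}-\frac{1}{\sigma^2+i\epsilon\sigma}\right]. \tag{12.225}$$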


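Finally, $[x_a(\tau')-x_a(\tau)]^2=-c^2\sigma^2+O(\sigma^4)$. Then, eq. (12.222) becomes

$$\begin{aligned}
F^{\mu\nu}_A[x_a(\tau)]&=-\frac{iq_a}{4\pi^2\epsilon_0c}\int_{-\infty}^{\infty}d\sigma\left\{\frac12\sigma^2\left[u_a^\mu(\tau)\dot u_a^\nu(\tau)-u_a^\nu(\tau)\dot u_a^\mu(\tau)\right]\right.\\
&\qquad\left.+\frac13\sigma^3\left[u_a^\mu(\tau)\ddot u_a^\nu(\tau)-u_a^\nu(\tau)\ddot u_a^\mu(\tau)\right]+O(\sigma^4)\right\}\frac{2\pi i}{c^4\sigma^2}\left[\epsilon(\sigma)\,\delta(\sigma^2)\right]_{\rm reg}\\
&=\frac{q_a}{2\pi\epsilon_0c^5}\int_{-\infty}^{\infty}d\sigma\left[\epsilon(\sigma)\,\delta(\sigma^2)\right]_{\rm reg}\left\{\frac12\left[u_a^\mu(\tau)\dot u_a^\nu(\tau)-u_a^\nu(\tau)\dot u_a^\mu(\tau)\right]\right.\\
&\qquad\left.+\frac13\sigma\left[u_a^\mu(\tau)\ddot u_a^\nu(\tau)-u_a^\nu(\tau)\ddot u_a^\mu(\tau)\right]+O(\sigma^2)\right\}.
\end{aligned} \tag{12.226}$$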


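We are therefore led to evaluate the integrals

$$B_n=\int_{-\infty}^{\infty}d\sigma\,\sigma^n\left[\epsilon(\sigma)\,\delta(\sigma^2)\right]_{\rm reg}, \tag{12.227}$$

with $n\geq0$. From eq. (12.225), $[\epsilon(\sigma)\delta(\sigma^2)]_{\rm reg}$ is an odd function of
$\sigma$, and therefore, $B_n$ vanishes for n even. For n odd and $n\geq3$ we can
remove the regularization, replacing $[\epsilon(\sigma)\delta(\sigma^2)]_{\rm reg}$ by $\epsilon(\sigma)\delta(\sigma^2)$, and we
get $B_n=0$.³¹ Therefore, the only finite contribution comes from n = 1,

31 This can be shown writing $\delta(\sigma^2)=
\delta(\sigma)/(2|\sigma|)$ and using $\epsilon(\sigma)|\sigma|=\sigma$.
Then, for n odd and $n\geq3$,

$$B_n=\frac12\int_{-\infty}^{\infty}d\sigma\,\sigma^{n-1}\,\delta(\sigma)\,, \tag{12.228}$$

and all these integrals vanish.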
and is given by

$$B_1=\frac{1}{2\pi i}\int_{-\infty}^{\infty}d\sigma\,\frac{2i\epsilon\sigma^2}{\sigma^4+\epsilon^2\sigma^2}=\frac{2\epsilon}{\pi}\int_0^{\infty}\frac{d\sigma}{\sigma^2+\epsilon^2}\,. \tag{12.229}$$

Writing $\sigma=\epsilon t$, $\epsilon$ cancels and

$$B_1=\frac{2}{\pi}\int_0^{\infty}\frac{dt}{t^2+1}=1\,. \tag{12.230}$$
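The fact that $B_1$ equals one for any finite value of the regulator, and not only in the limit $\epsilon\to0^+$, can be verified numerically starting from the regularized expression (12.225). The sketch below maps the infinite range onto a finite one with the substitution $\sigma=\epsilon\tan\theta$ and uses a midpoint rule; the function names and the number of points are illustrative assumptions.

```python
import math


def eps_delta_reg(sigma, eps):
    """Regularized eps(sigma) delta(sigma^2), eq. (12.225).

    (1 / (2 pi i)) [1/(sigma^2 - i eps sigma) - 1/(sigma^2 + i eps sigma)].
    The result is real; sigma = 0 must be avoided by the caller.
    """
    s2 = sigma * sigma
    diff = 1.0 / complex(s2, -eps * sigma) - 1.0 / complex(s2, eps * sigma)
    return (diff / (2j * math.pi)).real


def B1(eps, n_points=2000):
    """B_1 = int dsigma sigma [eps(sigma) delta(sigma^2)]_reg, eq. (12.227).

    Uses sigma = eps * tan(theta), theta in (-pi/2, pi/2), and a midpoint
    rule with an even number of points, so that neither sigma = 0 nor the
    end-points are ever sampled.
    """
    h = math.pi / n_points
    total = 0.0
    for k in range(n_points):
        theta = -0.5 * math.pi + (k + 0.5) * h
        sigma = eps * math.tan(theta)
        jacobian = eps / math.cos(theta) ** 2   # d sigma / d theta
        total += sigma * eps_delta_reg(sigma, eps) * jacobian
    return total * h


b1_values = [B1(eps) for eps in (1.0, 1.0e-3, 1.0e-6)]
```

All entries of `b1_values` should be equal to one within rounding errors, reflecting the cancellation of $\epsilon$ in eq. (12.230).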
Therefore, eq. (12.226) gives the exact result for $F^{\mu\nu}_A[x_a(\tau)]$,

$$F^{\mu\nu}_A[x_a(\tau)]=\frac{1}{4\pi\epsilon_0}\,\frac{2q_a}{3c^5}\left[u_a^\mu(\tau)\ddot u_a^\nu(\tau)-u_a^\nu(\tau)\ddot u_a^\mu(\tau)\right]. \tag{12.231}$$

Note that this result is finite, contrary to $F^{\mu\nu}_S[x_a(\tau)]$ that was divergent,
see eq. (12.209). Also observe that the result is fully covariant for the finite
part as well, so, even if our regularization (12.194) breaks Lorentz invariance
through the introduction of a fixed four-vector $\eta^\mu=\epsilon(1,0,0,0)$,
Lorentz invariance is recovered as $\epsilon\to0^+$, i.e., when the regularization
is removed.³²,³³ We now plug this into eq. (12.210) and we observe that,
from $u^\nu u_\nu=-c^2$ it follows, by taking a derivative with respect to $\tau$, that
$u^\nu\dot u_\nu=0$ and, taking one more derivative, $u^\nu\ddot u_\nu=-\dot u^\nu\dot u_\nu\equiv-\dot u^2$. Then

$$F^{\mu\nu}_A[x_a(\tau)]\,u_{a,\nu}(\tau)=\frac{1}{4\pi\epsilon_0}\,\frac{2q_a}{3c^3}\left[\ddot u_a^\mu(\tau)-\frac{\dot u_a^2(\tau)}{c^2}\,u_a^\mu(\tau)\right], \tag{12.232}$$

and eq. (12.213) becomes

$$\frac{dp_a^\mu}{d\tau}=\frac{1}{4\pi\epsilon_0}\,\frac{2q_a^2}{3c^3}\left[\ddot u_a^\mu(\tau)-\frac{\dot u_a^2(\tau)}{c^2}\,u_a^\mu(\tau)\right]+q_aF^{\mu\nu}_{\rm ext}[x_a(\tau)]\,u_{a,\nu}(\tau)\,. \tag{12.233}$$

We have therefore recovered the ALD equation (12.149). The derivation
of this section, although technically more involved, has the advantage
of unifying the results of the previous sections in a single, conceptually
clean, framework. Mass renormalization comes from the symmetric combination
of the advanced and retarded Green’s functions, which gives a
divergent contribution, so it requires a regularization (that, for the symmetric
combination of Green’s functions, can be performed in a fully
Lorentz-invariant manner), while the self-force term in the ALD equation
is due to the anti-symmetric combination of the advanced and retarded
Green’s functions, and is a finite contribution.


32-This is an unavoidable consequence 
of the fact that the result that we have 
found for FA [va(T)] is finite. In quan- 
tum field theory, there are situations, 
known as anomalies, where a symmetry broken by the regularization is not
recovered when the regularization is removed, but this only happens when
factors of order $\epsilon$ describing the correction to the result when the
cutoff is removed combine with terms of order $1/\epsilon$ from
divergences, to produce a finite result.


33 It is also interesting to observe that, if we had not been careful to
regularize the product $\epsilon(x^0)\delta(x^2)$, instead of eq. (12.229)
we would have now found the formal expression
$B_1 = \int_{-\infty}^{\infty} d\sigma\, \sigma\, \epsilon(\sigma)\delta(\sigma^2)$.
In the literature, this has been manipulated writing it as
$B_1 = 2\int_0^{\infty} d\sigma\, \sigma\, \delta(\sigma^2)$, since the
integrand is even and, for $\sigma > 0$, $\epsilon(\sigma) = 1$. For
$\sigma > 0$ we also have $\delta(\sigma^2) = (1/2\sigma)\delta(\sigma)$, so
we end up with $B_1 = \int_0^{\infty} d\sigma\, \delta(\sigma)$. However,
this integral is ill-defined, and has typically been dealt with by
introducing a prescription on $\delta(\sigma)$ only at this stage, such as
replacing it with $\delta(\sigma - \epsilon)$, with $\epsilon > 0$, so that
the integral becomes equal to one. At this level, such a prescription is
completely arbitrary and is only motivated by the desire to obtain the ALD
equation with the correct coefficient 2/3, which we already know from the
derivation performed using the covariantization (or from the
non-relativistic limit (12.126), where it is just the factor 2/3 that
appears in Larmor’s formula). Our computation shows how the correct
prescription emerges automatically by regularizing the product
$\epsilon(x^0)\delta(x^2)$ and always working with regularized quantities.


Electromagnetic fields in 
material media 


Until now we have studied the equations of electromagnetism “in vac- 
uum,” in the sense that the source terms, when they were not absent al- 
together (as when we studied the propagation of electromagnetic waves), 
were due to localized charge distributions, often even idealized as point 
charges, and we were interested in the electromagnetic field outside the 
sources. We now begin our study of electromagnetic fields in materials. 
At a microscopic level, the electromagnetic field has large variations, for 
instance near atoms in a solid, and it depends on a myriad of details. 
However, to understand the behavior of a material on larger scales, we 
do not need to know all these short-distance details. For instance, in a 
typical solid, there will be large fluctuations of the electromagnetic field 
at the atomic scale, i.e., at the scale of the Bohr radius, $r_B \sim 10^{-8}$ cm.
However, we are typically interested in the collective behavior of the
system on a macroscopic scale, say, for instance, 1 cm. We can then
choose a scale $L$ intermediate between the atomic and the macroscopic
scale, for instance $L \sim 100\, r_B \sim 10^{-6}$ cm, and average the electric and
magnetic fields over such a scale. This results in “smoothed-out” fields, 
that govern the behavior of the material at such larger scales. 

This is a standard approach in physics: in general, it would be im- 
possible to describe any physical system if we needed to know all details 
of what happens at short-distance scales. Furthermore, as we go to 
shorter and shorter scales, we encounter new phenomena and new phys- 
ical laws; for instance, at some point, going toward the atomic scale, 
classical physics must give way to quantum mechanics, and if we look 
even closer and closer into atoms and nuclei we enter the regime of rel- 
ativistic quantum field theory, we discover new interactions, and so on. 
The basic approach is therefore to develop an effective theory, valid at 
the length-scales at which we are interested, in which all short-distance 
details are taken into account in an “average” manner. Another impor- 
tant aspect is that our ignorance of the physical laws at short scales 
should be encoded into an effective description, involving a few phe- 
nomenological parameters that can be fixed by comparison with exper- 
iments; it is true that, for describing material media, we cannot really 
dispense with quantum mechanics (we have seen for instance in Prob- 
lem 10.2 that a classical model of the atom miserably fails, collapsing 
in about $10^{-11}$ s because of emission of electromagnetic radiation); still,
at least at a first level of analysis, we do not want to enter into all the 




details of the quantum-mechanical interactions at short distances, and
we want to have an effective classical description of the properties of a
material.


13.1 Maxwell’s equations for macroscopic 
fields 


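As a first step let us see how, from the fundamental, “microscopic”
Maxwell’s equations, we can derive equations governing the behavior of
smoothed fields. We denote by $\mathbf{E}_{\rm micro}(t,\mathbf{x})$ and
$\mathbf{B}_{\rm micro}(t,\mathbf{x})$ the microscopic electric and magnetic
fields, respectively, and by $\rho_{\rm micro}(t,\mathbf{x})$ and
$\mathbf{j}_{\rm micro}(t,\mathbf{x})$ the microscopic charge and current
densities. In this notation, Maxwell’s equations (3.8)–(3.11) read

$$
\nabla\cdot\mathbf{E}_{\rm micro} = \frac{1}{\epsilon_0}\,\rho_{\rm micro}, \tag{13.1}
$$
$$
\nabla\times\mathbf{B}_{\rm micro} - \frac{1}{c^2}\frac{\partial \mathbf{E}_{\rm micro}}{\partial t} = \mu_0\,\mathbf{j}_{\rm micro}, \tag{13.2}
$$
$$
\nabla\cdot\mathbf{B}_{\rm micro} = 0, \tag{13.3}
$$
$$
\nabla\times\mathbf{E}_{\rm micro} + \frac{\partial \mathbf{B}_{\rm micro}}{\partial t} = 0. \tag{13.4}
$$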


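We now smooth out these fields, performing a spatial average with the
help of a smoothing function $s(\mathbf{x})$. The smoothed electric and
magnetic fields, which we denote by $\mathbf{E}$ and $\mathbf{B}$, are defined by

$$
\mathbf{E}(t,\mathbf{x}) = \int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\mathbf{E}_{\rm micro}(t,\mathbf{x}'), \tag{13.5}
$$
$$
\mathbf{B}(t,\mathbf{x}) = \int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\mathbf{B}_{\rm micro}(t,\mathbf{x}'). \tag{13.6}
$$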


The exact form of the function $s(\mathbf{x}-\mathbf{x}')$ is not important;
the important point is that it is approximately constant for
$|\mathbf{x}-\mathbf{x}'|$ smaller than the smoothing scale $L$, vanishes
quickly for $|\mathbf{x}-\mathbf{x}'| > L$, and is a smooth function on the
scale $L$. We normalize it by

$$
\int d^3x\, s(\mathbf{x}) = 1, \tag{13.7}
$$


so that eqs. (13.5) and (13.6) represent an average of the microscopic
fields over a region of linear size of order $L$, centered around the point
$\mathbf{x}$. Note that, in terms of the spatial Fourier transform, eq. (13.5) reads

$$
\tilde{\mathbf{E}}(t,\mathbf{k}) = \tilde{s}(\mathbf{k})\,\tilde{\mathbf{E}}_{\rm micro}(t,\mathbf{k}). \tag{13.8}
$$


The fact that $s(\mathbf{x})$ is a smooth function on the spatial scale $L$
means that its Fourier modes with $|\mathbf{k}| \gg 1/L$ are very small.
Therefore, another way of thinking about this smoothing procedure is to say
that the role of $\tilde{s}(\mathbf{k})$ is to suppress the Fourier modes
with large $|\mathbf{k}|$ in $\tilde{\mathbf{E}}(t,\mathbf{k})$.
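The suppression of short-wavelength modes can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration: the Gaussian form of the smoothing function, the periodic box, the grid size, and the two test modes. The text only requires that $s$ be smooth on the scale $L$ and normalized as in eq. (13.7).

```python
import math

def gaussian_kernel(L, dx):
    """Discrete Gaussian weights of width L, normalized to sum to 1 (cf. eq. 13.7)."""
    half = int(4 * L / dx)  # truncate the kernel at about four widths
    w = [math.exp(-0.5 * (i * dx / L) ** 2) for i in range(-half, half + 1)]
    total = sum(w)
    return [wi / total for wi in w]

def smooth(field, kernel):
    """Periodic, discrete 1D analogue of eq. (13.5): a weighted local average."""
    n, half = len(field), len(kernel) // 2
    return [
        sum(kernel[j] * field[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def amplitude(field):
    return max(abs(v) for v in field)

n, box = 2000, 1.0          # grid points and box size (arbitrary units)
dx = box / n
L = 0.01                    # smoothing scale
x = [i * dx for i in range(n)]

slow = [math.cos(2 * math.pi * 1 * xi) for xi in x]    # |k| L << 1
fast = [math.cos(2 * math.pi * 200 * xi) for xi in x]  # |k| L >> 1

kernel = gaussian_kernel(L, dx)
slow_amp = amplitude(smooth(slow, kernel))
fast_amp = amplitude(smooth(fast, kernel))
print(f"slow mode amplitude after smoothing: {slow_amp:.4f}")
print(f"fast mode amplitude after smoothing: {fast_amp:.2e}")
```

The long-wavelength mode passes through the average almost unchanged, while the mode with $|k| L \gg 1$ is strongly suppressed, which is the content of eq. (13.8).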
We similarly define the smoothed-out, or macroscopic, charge and
current densities,

$$
\rho(t,\mathbf{x}) = \int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\rho_{\rm micro}(t,\mathbf{x}'), \tag{13.9}
$$
$$
\mathbf{j}(t,\mathbf{x}) = \int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\mathbf{j}_{\rm micro}(t,\mathbf{x}'). \tag{13.10}
$$

It is now straightforward to derive the equations satisfied by the
macroscopic fields. Consider for instance eq. (13.1) and apply the smoothing
to both sides,


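$$
\int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\nabla_{\mathbf{x}'}\cdot\mathbf{E}_{\rm micro}(t,\mathbf{x}')
= \frac{1}{\epsilon_0}\int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\rho_{\rm micro}(t,\mathbf{x}'). \tag{13.11}
$$

The integral on the right-hand side is just the macroscopic charge density
$\rho(t,\mathbf{x})$. On the left-hand side, integrating by parts (the
boundary term vanishes because $s$ falls off quickly) and using
$\nabla_{\mathbf{x}'} s(\mathbf{x}-\mathbf{x}') = -\nabla_{\mathbf{x}} s(\mathbf{x}-\mathbf{x}')$,
we write

$$
\begin{aligned}
\int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\nabla_{\mathbf{x}'}\cdot\mathbf{E}_{\rm micro}(t,\mathbf{x}')
&= -\int d^3x'\, [\nabla_{\mathbf{x}'} s(\mathbf{x}-\mathbf{x}')]\cdot\mathbf{E}_{\rm micro}(t,\mathbf{x}') \\
&= \nabla_{\mathbf{x}}\cdot\int d^3x'\, s(\mathbf{x}-\mathbf{x}')\,\mathbf{E}_{\rm micro}(t,\mathbf{x}') \\
&= \nabla_{\mathbf{x}}\cdot\mathbf{E}(t,\mathbf{x}).
\end{aligned} \tag{13.12}
$$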


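In other words, the averaging procedure commutes with the operation
of taking the divergence. We therefore get the macroscopic Maxwell
equation $\nabla_{\mathbf{x}}\cdot\mathbf{E}(t,\mathbf{x}) = (1/\epsilon_0)\rho(t,\mathbf{x})$.
The same manipulations can be performed on all other equations, so we
finally find the full set of macroscopic Maxwell’s equations

$$
\nabla\cdot\mathbf{E} = \frac{1}{\epsilon_0}\,\rho, \tag{13.13}
$$
$$
\nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t} = \mu_0\,\mathbf{j}, \tag{13.14}
$$
$$
\nabla\cdot\mathbf{B} = 0, \tag{13.15}
$$
$$
\nabla\times\mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0. \tag{13.16}
$$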


The result looks extremely simple: the macroscopic electric and magnetic
fields satisfy the same Maxwell’s equations as the microscopic fields,
with the microscopic charge and current densities replaced by their
macroscopic counterparts.¹ The simplicity of this result, however, is
deceptive. In reality, to use these equations, we must have a model 
for the macroscopic charge and current densities, and this is where the 
real difficulty resides. The huge variety and complexity of materials in 
condensed matter physics emerges from the variety of behaviors of the 
macroscopic charge and current densities. A detailed study belongs to a 
course of condensed matter physics. However, a broad classification of 
the simplest situations is possible, and will be discussed in the following 
sections. 


1 As a historical note, this averaging procedure is due to Lorentz.
However, his path was in the reverse order. He started from Maxwell’s
equations, as they had been empirically established at the macroscopic
level, and realized that the theory could also be applied to the microscopic
domain, with the macroscopic equations derived, through this averaging
procedure, from the more fundamental microscopic equations.

2 Observe that eq. (11.152) is valid also in the near outer region, i.e., as
long as we are outside the source. Therefore, given a small volume
$d^3x' \ll L^3$, which contains an electric dipole
$\mathbf{P}(t,\mathbf{x}')\,d^3x'$, it can be used to compute the field
outside it and, in the limit in which $d^3x'$ is actually infinitesimal, it
gives the correct result everywhere. Note that, by construction,
$\mathbf{P}(t,\mathbf{x}')$ is smooth already on a scale $L$, much larger
than the linear size of the infinitesimal volume $d^3x'$.


13.2 The macroscopic charge density: free 
and bound charges 


We begin with the macroscopic charge density. A first useful distinction,
for the electric charges in a material, is between those that are free to
move and those that are bound to atoms or molecules. In particular, in a
metal, some of the electrons are free to move, and this is what gives metals
their conduction properties. In insulators, free charges can be externally
added (by “doping” them), so for them, too, this distinction is useful.

Let us consider the effect of bound charges in a simple case such as an
ensemble of molecules, where the interaction between individual molecules
is weak. Each molecule is electrically neutral; however, its electric dipole
moment will in general be non-vanishing. We denote by $\mathbf{P}$ the
electric dipole moment per unit volume, obtained by averaging the individual
dipoles of the molecules over a given macroscopic volume (i.e., a volume
with a linear size $L$ much larger than the atomic scale, but small
compared to the scales at which we will measure the smoothed fields;
e.g., $L \sim 100\, r_B \sim 10^{-6}$ cm as in the discussion above). The
vector $\mathbf{P}$ is known as the polarization vector. In general,
$\mathbf{P}$ is a function of the point $\mathbf{x}$ around which we take a
spatial average and, because of the smoothing implied by the spatial
average, it only varies significantly on the scale $L$. From the multipole
expansion (11.152), we see that a dipole moment $\mathbf{d}(t)$ located at
the point $\mathbf{x}'$ generates at the point $\mathbf{x}$ a potential


$$
\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\,\frac{\mathbf{d}(t)\cdot(\mathbf{x}-\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|^3}. \tag{13.17}
$$
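Therefore, a distribution of electric dipoles with electric dipole moment
density $\mathbf{P}(t,\mathbf{x})$ within a material body of finite volume
$V$ generates a potential²

$$
\phi(t,\mathbf{x}) = \frac{1}{4\pi\epsilon_0}\int_V d^3x'\,\frac{\mathbf{P}(t,\mathbf{x}')\cdot(\mathbf{x}-\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|^3}. \tag{13.18}
$$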
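We now observe, from eq. (4.19), that

$$
\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3} = \nabla_{\mathbf{x}'}\frac{1}{|\mathbf{x}-\mathbf{x}'|}. \tag{13.19}
$$

Then, integrating by parts,

$$
\begin{aligned}
\phi(t,\mathbf{x}) &= \frac{1}{4\pi\epsilon_0}\int_V d^3x'\,\mathbf{P}(t,\mathbf{x}')\cdot\nabla_{\mathbf{x}'}\frac{1}{|\mathbf{x}-\mathbf{x}'|} \\
&= -\frac{1}{4\pi\epsilon_0}\int_V d^3x'\,\frac{\nabla_{\mathbf{x}'}\cdot\mathbf{P}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}
 + \frac{1}{4\pi\epsilon_0}\int_{\partial V} d^2s'\,\frac{\hat{\mathbf{n}}_{s'}\cdot\mathbf{P}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}.
\end{aligned} \tag{13.20}
$$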
Comparison with the second line of eq. (11.151) shows that a spatially
varying polarization vector generates a near-region field equal to that
produced by a charge density

$$
\rho_{\rm pol}(t,\mathbf{x}) = -\nabla\cdot\mathbf{P}(t,\mathbf{x}), \tag{13.21}
$$

as well as a surface charge density, on the boundary of the material,
given by

$$
\sigma_{\rm pol} = \hat{\mathbf{n}}_s\cdot\mathbf{P}(t,\mathbf{x}), \tag{13.22}
$$


where $\hat{\mathbf{n}}_s$ is the outer normal to the surface element
$d^2s$. Since this charge density arises from electric dipoles, each of
which is electrically neutral, if we integrate this charge density over a
volume $V$, we must find that the net charge inside the volume plus that on
the surface is zero. Indeed, using Gauss’s theorem,

$$
\int_V d^3x\,\rho_{\rm pol}(t,\mathbf{x}) = -\int_V d^3x\,\nabla\cdot\mathbf{P}(t,\mathbf{x})
= -\int_{\partial V} d^2s\,\hat{\mathbf{n}}_s\cdot\mathbf{P}(t,\mathbf{x}), \tag{13.23}
$$


and this is precisely minus the surface integral of the surface charge 
(13.22). These volume and surface charges are present, despite the fact 
that each dipole is neutral, because of the different way in which the 
dipoles arrange themselves spatially, as in the schematic example shown 
in Fig. 13.1. 
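The cancellation between volume and surface polarization charges can be checked on a simple one-dimensional example. The sketch below assumes a slab occupying $0 \le x \le a$ with a polarization directed along $x$; the profile `polarization` and its parameters are hypothetical choices made only for illustration. Charges are computed per unit transverse area.

```python
def polarization(x, a=1.0, p0=2.0e-6):
    """Hypothetical x-component of P (C/m^2) inside a slab 0 <= x <= a."""
    return p0 * (1.0 + x / a) ** 2

def bound_charge_per_area(P, a, n=1000):
    """Return (volume charge, surface charge) per unit transverse area."""
    dx = a / n
    volume = 0.0
    for i in range(n):
        left, right = i * dx, (i + 1) * dx
        rho_pol = -(P(right) - P(left)) / dx  # cell average of -dP/dx, eq. (13.21)
        volume += rho_pol * dx
    # sigma_pol = n.P, eq. (13.22): the outer normal is +x at x = a and -x at x = 0
    surface = P(a) - P(0.0)
    return volume, surface

volume_charge, surface_charge = bound_charge_per_area(polarization, a=1.0)
print("volume charge per unit area :", volume_charge)
print("surface charge per unit area:", surface_charge)
print("sum                         :", volume_charge + surface_charge)
```

Because the polarization grows with $x$ in this example, the bulk carries a net negative bound charge, exactly compensated by the net positive charge on the two faces, in agreement with eq. (13.23).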


13.3 The macroscopic current density 


A similar analysis can be performed for the current density. In the
simplest model, one writes

$$
\mathbf{j} = \mathbf{j}_{\rm free} + \mathbf{j}_{\rm pol} + \mathbf{j}_{\rm mag}. \tag{13.24}
$$

The term $\mathbf{j}_{\rm free}$ is the current carried by the electrons and
ions that are not bound to each other and move under the action of
electromagnetic fields, whose density is given by $\rho_{\rm free}$. It
satisfies a separate conservation equation

$$
\frac{\partial \rho_{\rm free}}{\partial t} + \nabla\cdot\mathbf{j}_{\rm free} = 0, \tag{13.25}
$$

that expresses the fact that the variation in time of the density of free
charges at a point $\mathbf{x}$, or, more precisely, in a small volume
centered at $\mathbf{x}$, is given by the flux of free charges flowing
through it. In solids, $\mathbf{j}_{\rm free}$ is relevant in conductors and
semiconductors.

The second term in eq. (13.24) is the current generated by a
time-dependent polarization density. According to eq. (13.21),

$$
\frac{\partial \rho_{\rm pol}}{\partial t} = -\nabla\cdot\frac{\partial \mathbf{P}}{\partial t}. \tag{13.26}
$$

Therefore, defining a current density

$$
\mathbf{j}_{\rm pol}(t,\mathbf{x}) = \frac{\partial \mathbf{P}(t,\mathbf{x})}{\partial t}, \tag{13.27}
$$



Fig. 13.1 A schematic example of 
a distribution of dipoles resulting 
in a volume charge density and a 
surface charge density. The dotted 
squares correspond to the volumes 
of size L used to smooth out the 
charge distribution. The gradient of 
the dipole distribution (here repre- 
sented by the variation of the num- 
ber of dipoles straddling through ad- 
jacent volumes in the vertical direc- 
tion) induces a net charge in the vol- 
umes in the interior of the mate- 
rial (the sum over the net charges 
in each horizontal block of cell is 
marked explicitly), which is com- 
pensated by the surface charge. If 
V-P = 0 (which, in the above pic- 
ture, can be obtained if a constant 
number of dipoles straddles across 
two neighboring vertical cells), then 
the charges inside the volume com- 
pensate exactly inside each cell, and 
we only have a surface charge, that 
integrates to zero. 


3 At the mathematical level, eq. (13.25) is a local equation, so one could
take an infinitesimally small volume. Recall, however, that the whole
macroscopic description implies a coarse graining over volumes larger than
the atomic size, say over regions of size $L \sim 10^2\, r_B$, as discussed
at the beginning of this chapter. Therefore, equations such as (13.25), and
all similar equations below, must always be understood as smoothed over
such small volumes.

4 The integration by parts is performed more clearly working in components
and observing that, for any differentiable function $f(\mathbf{x})$,

$$
\begin{aligned}
\int_V d^3x\,\epsilon_{ijk} M_j\,\partial_k f
&= -\int_V d^3x\,\epsilon_{ijk}(\partial_k M_j)\, f + \int_{\partial V} d^2s_k\,\epsilon_{ijk} M_j\, f \\
&= \int_V d^3x\,\epsilon_{ijk}(\partial_j M_k)\, f + \int_{\partial V} d^2s_k\,\epsilon_{ijk} M_j\, f.
\end{aligned} \tag{13.30}
$$

we also have the separate conservation equation

$$
\frac{\partial \rho_{\rm pol}}{\partial t} + \nabla\cdot\mathbf{j}_{\rm pol} = 0. \tag{13.28}
$$

Physically, $\mathbf{j}_{\rm pol}$ describes the fact that the configuration
of dipoles can evolve in time, and then the charge density associated with
it, such as the one shown in Fig. 13.1, evolves with time and therefore
generates a current.

The third term, called the magnetization current, is due to the density
of magnetic dipoles in the material, which, at the microscopic level, can
be due either to the motion of the electrons in atoms, or to the magnetic
moments associated with the spin of the electrons and of the nuclei. Its
treatment is analogous to that of the density of electric dipoles in the
previous section. To compute the vector potential generated in the
near zone by a magnetic dipole at the origin we use eq. (11.154). Then,
the vector potential generated in the near zone by a magnetic dipole
density $\mathbf{M}(t,\mathbf{x})$ (also called the magnetization) in a
material of finite volume $V$ is given by

$$
\begin{aligned}
\mathbf{A}(t,\mathbf{x}) &= \frac{\mu_0}{4\pi}\int_V d^3x'\,\frac{\mathbf{M}(t,\mathbf{x}')\times(\mathbf{x}-\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|^3} \\
&= \frac{\mu_0}{4\pi}\int_V d^3x'\,\mathbf{M}(t,\mathbf{x}')\times\nabla_{\mathbf{x}'}\frac{1}{|\mathbf{x}-\mathbf{x}'|}.
\end{aligned} \tag{13.29}
$$


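Integrating by parts, we get⁴

$$
\mathbf{A}(t,\mathbf{x}) = \frac{\mu_0}{4\pi}\int_V d^3x'\,\frac{\nabla_{\mathbf{x}'}\times\mathbf{M}(t,\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}
+ \frac{\mu_0}{4\pi}\int_{\partial V} d^2s'\,\frac{\mathbf{M}(t,\mathbf{x}')\times\hat{\mathbf{n}}_{s'}}{|\mathbf{x}-\mathbf{x}'|}. \tag{13.31}
$$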


Comparing with eq. (11.153) we see that a magnetic dipole density is
effectively equivalent to a magnetization current

$$
\mathbf{j}_{\rm mag}(t,\mathbf{x}) = \nabla\times\mathbf{M}(t,\mathbf{x}), \tag{13.32}
$$

and a surface current density

$$
\mathbf{K}_{\rm mag}(t,\mathbf{x}) = \mathbf{M}(t,\mathbf{x})\times\hat{\mathbf{n}}_s, \tag{13.33}
$$

where, again, $\hat{\mathbf{n}}_s$ is the outer normal to the surface
element $d^2s$. Observe that, being a curl,

$$
\nabla\cdot\mathbf{j}_{\rm mag} = 0, \tag{13.34}
$$

so there is no time-varying charge associated with it in a conservation
equation. This is a consequence of the fact that Gauss’s law (3.3) implies
that there are no magnetic charges, so $\mathbf{j}_{\rm mag}$ can only be
generated by magnetic dipoles.
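As a quick numerical illustration of eqs. (13.32) and (13.34), the sketch below builds $\mathbf{j}_{\rm mag} = \nabla\times\mathbf{M}$ by central finite differences and checks that its divergence vanishes. The magnetization profile `M`, the step `h`, and the evaluation point are arbitrary assumptions introduced only for this example.

```python
import math

def M(x, y, z):
    """Hypothetical smooth magnetization profile (arbitrary units)."""
    return (math.sin(y) * z, x * x * math.cos(z), math.exp(-x * x) * y)

def partial(f, comp, axis, p, h=1e-2):
    """Central difference of component `comp` of the vector field f along `axis`."""
    up, dn = list(p), list(p)
    up[axis] += h
    dn[axis] -= h
    return (f(*up)[comp] - f(*dn)[comp]) / (2 * h)

def curl(f):
    """Return a function that evaluates the finite-difference curl of f."""
    def c(x, y, z):
        p = (x, y, z)
        return (
            partial(f, 2, 1, p) - partial(f, 1, 2, p),
            partial(f, 0, 2, p) - partial(f, 2, 0, p),
            partial(f, 1, 0, p) - partial(f, 0, 1, p),
        )
    return c

def div(f, p):
    return sum(partial(f, i, i, p) for i in range(3))

j_mag = curl(M)                 # magnetization current, eq. (13.32)
point = (0.3, -0.7, 1.1)
div_j = div(j_mag, point)       # should vanish, eq. (13.34)
print("j_mag at the point:", j_mag(*point))
print("div j_mag         :", div_j)
```

Central differences along different axes commute, so the discrete divergence of the discrete curl vanishes up to rounding errors, mirroring the exact identity $\nabla\cdot(\nabla\times\mathbf{M}) = 0$.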

A possible extension of the simple model (13.24) takes into
account the effect of convection, i.e., the transport of charges in gases
and liquids through bulk motions of the fluid. Examples of situations
where this takes place are provided by molten metals and plasmas, by
convective motions in the interior of stars, or by clouds carrying free
electrons that move through the atmosphere under the action of winds.
In general, if $\rho(t,\mathbf{x})$ is the density of a fluid (density of
mass, or of charge) and $\mathbf{u}(t,\mathbf{x})$ is its velocity field,
the corresponding convective current is
$\mathbf{j}_{\rm conv}(t,\mathbf{x}) = \rho(t,\mathbf{x})\,\mathbf{u}(t,\mathbf{x})$.
Free charges can be transported by convective motions, and this gives rise
to the convective current

$$
\mathbf{j}_{\rm conv,free}(t,\mathbf{x}) = \rho_{\rm free}(t,\mathbf{x})\,\mathbf{u}(t,\mathbf{x}). \tag{13.35}
$$

The total flow of free charges is therefore described by the current

$$
\mathbf{j}_{\rm free}(t,\mathbf{x}) = \mathbf{j}_{\rm cond,free}(t,\mathbf{x}) + \mathbf{j}_{\rm conv,free}(t,\mathbf{x}), \tag{13.36}
$$

where $\mathbf{j}_{\rm cond,free}$ is the contribution from electric
conduction, for which earlier, in the absence of convection, we used the
notation $\mathbf{j}_{\rm free}(t,\mathbf{x})$. The difference between
$\mathbf{j}_{\rm cond,free}$ and $\mathbf{j}_{\rm conv,free}$ is that the
former is generated by free charges moving under the action of electric
fields, while the latter is due to free charges moving under the action of
mechanical forces. The continuity equation for the free charges reads⁵

$$
\frac{\partial \rho_{\rm free}}{\partial t} + \nabla\cdot\left[\mathbf{j}_{\rm cond,free}(t,\mathbf{x}) + \mathbf{j}_{\rm conv,free}(t,\mathbf{x})\right] = 0, \tag{13.37}
$$

and expresses the fact that the variation in time of the density of free
charges in a small volume is given by the flux of free charges flowing
through it either because of conduction, or because of convection.⁶


13.4 Maxwell’s equations in material 
media 
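We now have all the elements for writing Maxwell’s equations in material
media. We start from Gauss’s law (13.13), and we write the source term
as $\rho = \rho_{\rm free} + \rho_{\rm pol}$.⁷ Then, using eq. (13.21),

$$
\epsilon_0\nabla\cdot\mathbf{E} = \rho_{\rm free} + \rho_{\rm pol} = \rho_{\rm free} - \nabla\cdot\mathbf{P}. \tag{13.39}
$$

It is convenient to define the electric displacement vector $\mathbf{D}$ as

$$
\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}, \tag{13.40}
$$

so that Gauss’s law can be rewritten as

$$
\nabla\cdot\mathbf{D} = \rho_{\rm free}. \tag{13.41}
$$

We proceed similarly for the Ampère–Maxwell equation (13.14). Using
eq. (13.24), together with eqs. (13.27) and (13.32), we get

$$
\nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial \mathbf{E}}{\partial t}
= \mu_0\left(\mathbf{j}_{\rm free} + \frac{\partial \mathbf{P}}{\partial t} + \nabla\times\mathbf{M}\right), \tag{13.42}
$$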


5 As we will see below eq. (13.42), this continuity equation is actually a
consequence of Maxwell’s equations, just as the conservation equation for
free charges discussed in Section 3.2.1.

6 Polarization charges, being bound to the molecules, can also be
transported by convective motions. The corresponding convective current is
$\mathbf{j}_{\rm conv,pol}(t,\mathbf{x}) = \rho_{\rm pol}(t,\mathbf{x})\,\mathbf{u}(t,\mathbf{x})$,
and can be further added to the right-hand side of the macroscopic
Ampère–Maxwell equation. Note that $\nabla\cdot\mathbf{j}_{\rm conv,pol} = 0$.
This can be shown observing that the continuity equation for the
polarization charge now reads

$$
\partial_t\rho_{\rm pol} + \nabla\cdot\left[\mathbf{j}_{\rm cond,pol} + \mathbf{j}_{\rm conv,pol}\right] = 0, \tag{13.38}
$$

where $\mathbf{j}_{\rm cond,pol}$ is given by eq. (13.27). However, from
eqs. (13.21) and (13.27), we have
$\partial_t\rho_{\rm pol} + \nabla\cdot\mathbf{j}_{\rm cond,pol} = 0$
automatically, and therefore $\nabla\cdot\mathbf{j}_{\rm conv,pol} = 0$.

7 The notation $\rho_{\rm free}$ is quite conventional. However, more
precisely, this refers to any other source of charge, not related to the
neutral dipoles inside the material. For instance, it could refer to an
external charge that we have placed inside an otherwise neutral dielectric,
as in Solved Problem 13.1, or to charges external to the material, that
contribute to the electric field inside it.

8 Note that there is no universal convention on the nomenclature of these
fields. In some books $\mathbf{H}$ is called the “magnetic field” and
$\mathbf{B}$ the “magnetic induction.” This reflects the historical
development, when it was not yet understood that $\mathbf{B}$ is a more
fundamental entity than $\mathbf{H}$. We will rather follow the alternative
convention of calling $\mathbf{B}$ the magnetic field, while $\mathbf{H}$
will be called simply the “H” field.

Also note that $\mathbf{H}$ does not have the same dimensions as
$\mathbf{B}$, since $\mu_0$ is not dimensionless. Rather, from eq. (13.44)
we see that $\nabla\times\mathbf{H}$ has the same dimensions as
$\mathbf{j}_{\rm free}$. Recalling that $\mathbf{j}_{\rm free}$ is a current
per unit surface [see the discussion below eq. (2.21)], we see that, in the
SI system, $\mathbf{H}$ has dimensions of a current divided by a length, and
is therefore measured in A/m. Similarly, from eq. (13.41), in the SI system
$\mathbf{D}$ is measured in C/m².


where, if we also include convection, $\mathbf{j}_{\rm free}$ is given by eq. (13.36). Note
that, taking the divergence of eq. (13.42) and using eqs. (8.27) and 
(13.39), one gets the continuity equation (13.37), as expected. 
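For completeness, the steps behind this statement can be sketched as follows, using only $\nabla\cdot(\nabla\times\mathbf{V}) = 0$ for any vector field $\mathbf{V}$, the relation $1/c^2 = \epsilon_0\mu_0$, and eq. (13.39):

```latex
\begin{aligned}
% divergence of eq. (13.42); the curls drop out
-\frac{1}{c^2}\,\frac{\partial}{\partial t}\,\nabla\cdot\mathbf{E}
  &= \mu_0\left(\nabla\cdot\mathbf{j}_{\rm free}
     + \frac{\partial}{\partial t}\,\nabla\cdot\mathbf{P}\right), \\
% insert eq. (13.39) and 1/c^2 = \epsilon_0\mu_0
-\mu_0\,\frac{\partial}{\partial t}\left(\rho_{\rm free} - \nabla\cdot\mathbf{P}\right)
  &= \mu_0\,\nabla\cdot\mathbf{j}_{\rm free}
     + \mu_0\,\frac{\partial}{\partial t}\,\nabla\cdot\mathbf{P}, \\
% the polarization terms cancel
\frac{\partial\rho_{\rm free}}{\partial t} + \nabla\cdot\mathbf{j}_{\rm free} &= 0 .
\end{aligned}
```

The terms involving $\nabla\cdot\mathbf{P}$ cancel between the two sides, which is just the separate conservation of the polarization charge, eq. (13.28).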

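Similarly to what we have done for the electric dipole contribution in
Gauss’s law, the magnetization current can be reabsorbed into a
redefinition of the magnetic field,

$$
\mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M}. \tag{13.43}
$$

Then, using also eq. (13.40) to eliminate $\mathbf{E}$ in favor of
$\mathbf{D}$, eq. (13.42) becomes

$$
\nabla\times\mathbf{H} - \frac{\partial \mathbf{D}}{\partial t} = \mathbf{j}_{\rm free}. \tag{13.44}
$$

Thus, the two Maxwell’s equations involving the sources can be rewritten
more naturally in terms of $\mathbf{D}$ and $\mathbf{H}$.⁸ The full set of
macroscopic Maxwell’s equations in material media therefore reads

$$
\nabla\cdot\mathbf{D} = \rho_{\rm free}, \tag{13.45}
$$
$$
\nabla\times\mathbf{H} - \frac{\partial \mathbf{D}}{\partial t} = \mathbf{j}_{\rm free}, \tag{13.46}
$$
$$
\nabla\cdot\mathbf{B} = 0, \tag{13.47}
$$
$$
\nabla\times\mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0. \tag{13.48}
$$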

Observe that the second pair of Maxwell’s equations still involves
$\mathbf{B}$ and $\mathbf{E}$, rather than $\mathbf{H}$ and $\mathbf{D}$.
Also note that, for instance, in general $\nabla\cdot\mathbf{H} \neq 0$,
since $\nabla\cdot\mathbf{M}$ need not be zero. So, in general, there is no
vector potential whose curl gives $\mathbf{H}$, and therefore no scalar
potential in terms of which we could express the $\mathbf{D}$ field.

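As a first simple application, consider electrostatics in materials. In
this case, the two equations to be solved are

$$
\nabla\cdot\mathbf{D} = \rho_{\rm free}, \qquad \nabla\times\mathbf{E} = 0. \tag{13.49}
$$

Using eq. (13.40), we can rewrite them as

$$
\nabla\cdot\mathbf{D} = \rho_{\rm free}, \qquad \nabla\times\mathbf{D} = \nabla\times\mathbf{P}. \tag{13.50}
$$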


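Comparing with eqs. (4.1) and (4.2) we see that the equations governing
electrostatics in materials are not just obtained by replacing $\mathbf{E}$
with $\mathbf{D}/\epsilon_0$ (and $\rho$ with $\rho_{\rm free}$) in
eq. (4.1), since eq. (4.2) states that $\nabla\times\mathbf{E} = 0$ while,
from eq. (13.50), in general $\nabla\times\mathbf{D} \neq 0$. This also
implies that we cannot write $\mathbf{D}$ as the gradient of a scalar
function. However, we have already seen in Solved Problem 11.1 that a
vector field is determined uniquely by its divergence and its curl, through
Helmholtz’s theorem in the form given in eq. (11.174). From (13.50) we can
therefore immediately write the solution for $\mathbf{D}$,

$$
\mathbf{D}(\mathbf{x}) = -\nabla\left(\int d^3x'\,\frac{\rho_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right)
+ \nabla\times\left(\int d^3x'\,\frac{(\nabla\times\mathbf{P})(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right), \tag{13.51}
$$

to be compared with eq. (4.22) in the vacuum case. Using eq. (13.40)
we can rewrite this solution in terms of $\mathbf{E}$, as

$$
\mathbf{E}(\mathbf{x}) = -\nabla\left(\int d^3x'\,\frac{\rho_{\rm free}(\mathbf{x}')}{4\pi\epsilon_0|\mathbf{x}-\mathbf{x}'|}\right)
+ \nabla\times\left(\int d^3x'\,\frac{(\nabla\times\mathbf{P})(\mathbf{x}')}{4\pi\epsilon_0|\mathbf{x}-\mathbf{x}'|}\right)
- \frac{\mathbf{P}(\mathbf{x})}{\epsilon_0}. \tag{13.52}
$$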


In vacuum (in the sense used here, i.e., in the absence of bound charges or
any other complexity due to a macroscopic material) we have
$\mathbf{P} = 0$ and $\rho = \rho_{\rm free}$, and we recover of course
eq. (4.18) or, equivalently, eq. (4.22). In Solved Problem 13.1 we will
apply these results to the case of the interface between two simple linear
dielectrics.

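We can proceed similarly for magnetostatics. In this case the two
equations to be solved are

$$
\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{H} = \mathbf{j}_{\rm free}. \tag{13.53}
$$

Using eq. (13.43), we can rewrite them as

$$
\nabla\cdot\mathbf{H} = -\nabla\cdot\mathbf{M}, \qquad \nabla\times\mathbf{H} = \mathbf{j}_{\rm free}. \tag{13.54}
$$

Using again eq. (11.174), the general solution for $\mathbf{H}$ is

$$
\mathbf{H}(\mathbf{x}) = \nabla\times\left(\int d^3x'\,\frac{\mathbf{j}_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right)
+ \nabla\left(\int d^3x'\,\frac{(\nabla\cdot\mathbf{M})(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right), \tag{13.55}
$$

or, in terms of $\mathbf{B}$,

$$
\mathbf{B}(\mathbf{x}) = \nabla\times\left(\int d^3x'\,\frac{\mu_0\,\mathbf{j}_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right)
+ \mu_0\nabla\left(\int d^3x'\,\frac{(\nabla\cdot\mathbf{M})(\mathbf{x}')}{4\pi|\mathbf{x}-\mathbf{x}'|}\right)
+ \mu_0\mathbf{M}(\mathbf{x}). \tag{13.56}
$$

When the magnetization vanishes and $\mathbf{j}_{\rm free}$ is the same as
the total current $\mathbf{j}$, we recover eq. (4.94).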
13.5 Boundary conditions on E, B, D, H


When we have boundaries between different materials, or between a
material and the vacuum, to complete the specification of a problem we
must also assign the boundary conditions. To this purpose, consider a
cylinder of infinitesimal height $h$ straddling the boundary surface
between two media, and let $A$ be the area of the faces of the cylinder,
with the upper face lying in medium 2 and the lower face in medium 1, as in
Fig. 13.2. From eq. (13.45),

$$
\int_V d^3x\,\nabla\cdot\mathbf{D} = \int_V d^3x\,\rho_{\rm free}, \tag{13.57}
$$
V V 


Fig. 13.2 An infinitesimal cylinder across the boundary between two
media.

9 Note, however, that a formal mathematical limit $A \to 0$ is not
physically meaningful. The area $A$ must still be larger than $O(L^2)$,
where $L$ is the length-scale introduced above to define the macroscopic
equations of motion. The same is implicit for the formal limit $h \to 0$
used below.


Fig. 13.3 An infinitesimal loop 
across the boundary between two 
media. 


where $V$ is the volume of the cylinder. For definiteness, let us take the
$z$ axis along the height of the cylinder, so

$$
\int_V d^3x = \int_A dx\,dy\int_{-h/2}^{+h/2} dz. \tag{13.58}
$$

In the limit $h \to 0$, we have

$$
\int_{-h/2}^{+h/2} dz\,\rho_{\rm free} = \sigma_{\rm free}, \tag{13.59}
$$

where $\sigma_{\rm free}$ is the surface density of free charges. If,
furthermore, we take $A$ sufficiently small, so that $\sigma_{\rm free}$
can be taken constant over $A$,⁹ then

$$
\int_A dx\,dy\int_{-h/2}^{+h/2} dz\,\rho_{\rm free} = A\,\sigma_{\rm free}. \tag{13.60}
$$

On the left-hand side of eq. (13.57), in the limit $h \to 0$ we have

$$
\begin{aligned}
\int_V d^3x\,\nabla\cdot\mathbf{D}
&= \int_A dx\,dy\int_{-h/2}^{+h/2} dz\,\partial_z D_z
 + \int_{-h/2}^{+h/2} dz\int_A dx\,dy\,(\partial_x D_x + \partial_y D_y) \\
&= A\,[D_z(2) - D_z(1)] + O(h),
\end{aligned} \tag{13.61}
$$

where $D_z(2)$ is the value of $D_z$ as we approach the boundary from
medium 2, and $D_z(1)$ is the value when we approach the boundary from
medium 1. Therefore, sending $h \to 0$ with $A$ sufficiently small so
that $\sigma_{\rm free}$ is constant over it (but still $\sqrt{A}$ much
larger than $h$), we find

$$
D_z(2) - D_z(1) = \sigma_{\rm free}. \tag{13.62}
$$

For a boundary with a normal $\hat{\mathbf{n}}$ (pointing from medium 1
toward medium 2) in a generic direction, rather than along
$\hat{\mathbf{z}}$, we then have

$$
\hat{\mathbf{n}}\cdot(\mathbf{D}_2 - \mathbf{D}_1) = \sigma_{\rm free}. \tag{13.63}
$$

Applying the same reasoning to eq. (13.47), we get instead

$$
\hat{\mathbf{n}}\cdot(\mathbf{B}_2 - \mathbf{B}_1) = 0. \tag{13.64}
$$

To get the boundary conditions for $\mathbf{E}$ and $\mathbf{H}$ we
consider a one-dimensional loop $C$ with the longer side $L$ parallel to the
boundary and the shorter side, of length $h$, straddling across the two
materials, as in Fig. 13.3. Integrating eq. (13.48) over a two-dimensional
surface $S$ bounded by this loop, we get

$$
0 = \int_S d\mathbf{s}\cdot\left[\nabla\times\mathbf{E} + \frac{\partial \mathbf{B}}{\partial t}\right]
= \int_S d\mathbf{s}\cdot\nabla\times\mathbf{E} + \frac{\partial}{\partial t}\int_S d\mathbf{s}\cdot\mathbf{B}. \tag{13.65}
$$

Let us take again the $z$ axis in the direction of the normal to the
interface, which in this case corresponds to the short side of the loop,
and the $x$ axis in the direction of the loop, so the long side of the loop
goes from $x = -L/2$ to $x = L/2$, at fixed $y$ and $z$. The normal to the
surface $d\mathbf{s}$ is therefore along the $y$ direction, so
$d\mathbf{s}\cdot\mathbf{B} = dx\,dz\,B_y$. Since $B_y$ is finite at the
surface, its surface integral vanishes as $h \to 0$. Then, using Stokes’s
theorem (1.38), in the limit $h \to 0$ we get

$$
\begin{aligned}
0 &= \int_S d\mathbf{s}\cdot\nabla\times\mathbf{E} = \oint_{C=\partial S} d\boldsymbol{\ell}\cdot\mathbf{E} \\
&= \int_{-L/2}^{L/2} dx\,E_x(x,y,z=-h/2) + \int_{-h/2}^{h/2} dz\,E_z(x=L/2,y,z) \\
&\quad + \int_{L/2}^{-L/2} dx\,E_x(x,y,z=h/2) + \int_{h/2}^{-h/2} dz\,E_z(x=-L/2,y,z) \\
&= -\int_{-L/2}^{L/2} dx\,\left[E_x(x,y,z=h/2) - E_x(x,y,z=-h/2)\right] \\
&\quad + \int_{-h/2}^{h/2} dz\,\left[E_z(x=L/2,y,z) - E_z(x=-L/2,y,z)\right].
\end{aligned} \tag{13.66}
$$

The last integral vanishes as $h \to 0$, since the electric field is finite
on the boundary, while, taking $L$ sufficiently small so that the variation
of the field with $x$ can be neglected, we get

$$
0 = L\left[E_x(x,y,z\to 0^+) - E_x(x,y,z\to 0^-)\right], \tag{13.67}
$$

and therefore $E_x$ is continuous across the boundary. The same argument
could be made for a loop in the $(y,z)$ plane, leading to the conclusion
that $E_y$ is also continuous, so the components of the electric field
transverse to the normal to the interface are continuous. For a generic
normal $\hat{\mathbf{n}}$, not necessarily oriented along $z$ (again with
the convention that it points from medium 1 toward medium 2), we can write
this as

$$
\hat{\mathbf{n}}\times(\mathbf{E}_2 - \mathbf{E}_1) = 0. \tag{13.68}
$$


Finally, we can repeat the same argument for eq. (13.46). Now the
integral of $\partial\mathbf{D}/\partial t$ over $d\mathbf{s}$ vanishes
because $\mathbf{D}$ is finite on the boundary (just as the integral of
$\mathbf{B}$ in the previous computation). We define the surface
current of free charges as

$$
\mathbf{K}_{\rm free} = \int_{-h/2}^{+h/2} dz\,\mathbf{j}_{\rm free} \tag{13.69}
$$

(for an interface whose normal is along the $z$ axis, with $dz$ replaced by
$\hat{\mathbf{n}}\cdot d\mathbf{x}$ for generic $\hat{\mathbf{n}}$). Then,
with the same computations as before, we get

$$
\hat{\mathbf{n}}\times(\mathbf{H}_2 - \mathbf{H}_1) = \mathbf{K}_{\rm free}. \tag{13.70}
$$
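The two boundary conditions that involve sources, eqs. (13.63) and (13.70), can be read as a recipe for extracting the free surface charge and current from the fields on the two sides of an interface. The following is a minimal sketch under stated assumptions: the helper names and the numerical field values are hypothetical, and the normal is taken to point from medium 1 toward medium 2, as in the text.

```python
def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def surface_sources(n, D1, D2, H1, H2):
    """Free surface charge and current implied by the jumps of D and H.

    n is the unit normal pointing from medium 1 toward medium 2.
    """
    sigma_free = dot(n, sub(D2, D1))   # eq. (13.63)
    K_free = cross(n, sub(H2, H1))     # eq. (13.70)
    return sigma_free, K_free

# Hypothetical field values just below (1) and just above (2) the plane z = 0
n = (0.0, 0.0, 1.0)
D1, D2 = (1.0e-9, 0.0, 2.0e-9), (3.0e-9, 0.0, 5.0e-9)   # C/m^2
H1, H2 = (10.0, 0.0, 1.0), (10.0, 4.0, 7.0)             # A/m

sigma_free, K_free = surface_sources(n, D1, D2, H1, H2)
print("sigma_free [C/m^2]:", sigma_free)
print("K_free [A/m]      :", K_free)
```

Only the jump of the normal component of $\mathbf{D}$ and the jump of the tangential components of $\mathbf{H}$ contribute; the tangential jump of $\mathbf{D}$ and the normal jump of $\mathbf{H}$ in this example drop out of the result.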
10 The factor $\epsilon_0$ has been inserted in eq. (13.71) so that
$\chi_e$ is a dimensionless quantity.

11 The dielectric constant in the SI system is often also denoted by
$\kappa_e$.

12 The situation is the same as illustrated in Fig. 13.1, except that, if
$\rho_{\rm free} = 0$, we have $\nabla\cdot\mathbf{D} = 0$ and then, for a
linear dielectric, also $\nabla\cdot\mathbf{P} = 0$. In the absence of a
gradient of the polarization vector, the dipole charges inside the volume
cancel each other, and only the surface charge remains.


13.6 Constitutive relations 


Finally, in order to make use of eqs. (13.45)—(13.48), we need to know 
how D and H are related to the fundamental fields E and B (i.e., to 
have a model for P and M). These are called constitutive relations, 
and it is at this stage that the great variety of materials enters. It 
should be stressed that such relations are not fundamental, contrary 
to Maxwell’s equations in vacuum. Rather, they are phenomenological 
relations, assumed to catch the main properties of a material, at least 
for some range of parameters (such as the strength or frequency of the 
electric field). We now discuss some common constitutive relations. 


13.6.1 Dielectrics 


A dielectric is defined as a material with no (or very few) free charges.
Since the electric conductivity is given by the free charges, dielectrics
are insulators. A linear dielectric material is defined by the condition
that the polarization $\mathbf{P}$ induced by an applied electric field
$\mathbf{E}$ is linear in $\mathbf{E}$. In its simplest form, this means that

$$
\mathbf{P}(t,\mathbf{x}) = \epsilon_0\chi_e\,\mathbf{E}(t,\mathbf{x}), \tag{13.71}
$$

for some constant $\chi_e$, which is called the electric susceptibility.¹⁰
From eq. (13.40), this implies that

$$
\mathbf{D}(t,\mathbf{x}) = \epsilon\,\mathbf{E}(t,\mathbf{x}), \tag{13.72}
$$

where the permittivity of the material, denoted by $\epsilon$, is given by

$$
\epsilon = \epsilon_0(1 + \chi_e). \tag{13.73}
$$

The dielectric constant (or relative permittivity) of the material is defined
as $\epsilon_r = \epsilon/\epsilon_0$,¹¹ so

$$
\epsilon_r = 1 + \chi_e. \tag{13.74}
$$

This constitutive relation implies a simple relation between the
polarization charges and the free charges of the medium: from eqs. (13.71)
and (13.72),

$$
\mathbf{P}(t,\mathbf{x}) = \frac{\chi_e}{1+\chi_e}\,\mathbf{D}(t,\mathbf{x}). \tag{13.75}
$$

Using eqs. (13.21) and (13.45), we then obtain

$$
\rho_{\rm pol}(t,\mathbf{x}) = -\frac{\chi_e}{1+\chi_e}\,\nabla\cdot\mathbf{D}(t,\mathbf{x})
= -\frac{\chi_e}{1+\chi_e}\,\rho_{\rm free}(t,\mathbf{x}). \tag{13.76}
$$

Observe that the density of bound charges has the sign opposite to that
of the free charges, and vanishes if $\rho_{\rm free} = 0$. For a perfect
insulator there are no free charges, so also $\rho_{\rm pol} = 0$.
Therefore, in a linear dielectric, all polarization charge resides on the
surface of the material.¹² Note also that $\chi_e$ is always positive
because an external static electric field induces an electric dipole moment
in the same direction, so $\mathbf{P}$ has the same direction as
$\mathbf{E}$. Therefore, in all materials described by eq. (13.72),
$\epsilon_r > 1$.

A simple constitutive relation such as eq. (13.72) is, however, only valid for quasi-static fields. More generally, actual linear dielectrics are described by a constitutive relation of the form

$$\tilde{\mathbf{D}}(\omega) = \epsilon(\omega)\, \tilde{\mathbf{E}}(\omega), \tag{13.77}$$

where $\tilde{\mathbf{D}}(\omega)$ and $\tilde{\mathbf{E}}(\omega)$ are the Fourier modes of $\mathbf{D}(t)$ and $\mathbf{E}(t)$, respectively and, as is customary in these equations, we have omitted the tilde over the Fourier transform of $\epsilon(t)$ (the fact that $\epsilon(\omega)$ is the Fourier transform is anyhow clear from its argument $\omega$). In Section 14.2 we will discuss the form of this relation in the time domain and the constraints imposed by causality on the function $\epsilon(\omega)$, while in Section 14.3 we will develop a simple explicit model for $\epsilon(\omega)$.

Note that $\epsilon(\omega)$, being the Fourier transform of a real function, is a complex function. We will refer to $\epsilon(\omega)$ as the permittivity function of the material, and to $\epsilon_r(\omega) = \epsilon(\omega)/\epsilon_0$ as the relative permittivity function (or the dielectric function) of the material. The permittivity $\epsilon$ defined in eq. (13.72) is the limit of the permittivity function for $\omega \to 0$, i.e., for static fields and, similarly, the dielectric constant $\epsilon_r = \epsilon/\epsilon_0$ is the limit of the dielectric function $\epsilon_r(\omega)$ for $\omega \to 0$. As an example, Fig. 13.4 shows the real and imaginary parts of the dielectric function of water, at different temperatures. The dielectric constant $\epsilon_r = \epsilon_r(\omega = 0)$ can take values over a broad range.¹³

In anisotropic materials, eq. (13.72) generalizes to $D_i = \epsilon_{ij}E_j$, so the permittivity is promoted to a spatial tensor $\epsilon_{ij}$, and the dielectric constant becomes a tensor $(\epsilon_r)_{ij} = \epsilon_{ij}/\epsilon_0$. Correspondingly, eq. (13.77) has the anisotropic generalization

$$\tilde{D}_i(\omega) = \epsilon_{ij}(\omega)\, \tilde{E}_j(\omega). \tag{13.78}$$


As with any phenomenological relation, eqs. (13.72), (13.77), or (13.78) have a finite range of validity. In particular, the assumption of linear response breaks down when the external field $\mathbf{E}$ exceeds a critical threshold, leading to dielectric breakdown. Beyond this critical field, current flows in the material and the insulator becomes conductive. For instance, for air, when the electric field exceeds about $10^6$ V/m, the air molecules become ionized and air becomes a conductor. Lightning occurs when there is a sufficient build-up of electric charges between the clouds and the ground, so that the threshold value of the electric field is exceeded. Then, air becomes a conductor along a path where the charges flow from the cloud to the ground. In Solved Problem 13.1 we will examine some simple examples of electrostatics in dielectrics.

Fig. 13.4 (Relative) permittivity function $\epsilon_r(\omega)$ for water, as a function of frequency $f = \omega/(2\pi)$, for different temperatures from $T = 0^\circ$C to $T = 100^\circ$C [with lower temperatures corresponding to higher values of $\epsilon_r(\omega = 0)$]. Solid lines correspond to the real part, dashed lines to the imaginary part. From Andryieuski et al. (2015) (Creative Commons Attribution 4.0 International).


13 For example, for water, as we see from Fig. 13.4, $\epsilon_r$ has a significant dependence on the temperature, and ranges from about $\epsilon_r \simeq 88$ at $0^\circ$C to $\epsilon_r \simeq 55$ at $100^\circ$C, with $\epsilon_r \simeq 80$ at $20^\circ$C. However, in its gaseous state, at $110^\circ$C and 1 atm, it drops to $\epsilon_r \simeq 1.012$. For air at 1 atm, $\epsilon_r \simeq 1.00054$, which becomes 1.0548 at 100 atm. For some special materials, $\epsilon_r$ can reach values of order $10^3$–$10^4$ (e.g., $2\times 10^3$ for barium titanate and larger than $10^4$ for calcium copper titanate). We will see in eq. (13.104) that the capacitance of a capacitor filled with a dielectric is enhanced by a factor $\epsilon_r$ compared to the vacuum case. Materials with such a huge dielectric constant can therefore be used to make super-capacitors.

13.6.2 Metals


In metals, there are both bound and free charges. The effect of bound charges is still described, in a linear approximation, by eq. (13.77) [or by its generalization (13.78)], which we now write as

$$\tilde{\mathbf{D}}(\omega) = \epsilon_b(\omega)\, \tilde{\mathbf{E}}(\omega), \tag{13.79}$$

where the subscript “b” in $\epsilon_b(\omega)$ stresses that this is the contribution of the bound electrons. The free electrons, in contrast, generate a current density $\mathbf{j}_{\rm free}$. The simplest constitutive relation describing the current generated by the free electrons is Ohm’s law, which states that the steady current of free charges generated by an applied static electric field is given by

$$\mathbf{j}_{\rm free} = \sigma\, \mathbf{E}, \tag{13.80}$$


where $\sigma$ is called the conductivity. Similarly to eq. (13.72), this relation only holds in the static limit, and can be generalized to a frequency-dependent relation, as

$$\tilde{\mathbf{j}}_{\rm free}(\omega) = \sigma(\omega)\, \tilde{\mathbf{E}}(\omega). \tag{13.81}$$

Again, we can generalize to anisotropic media, writing $(\tilde{j}_{\rm free})_i(\omega) = \sigma_{ij}(\omega)\tilde{E}_j(\omega)$, and such linear relations are valid only up to a maximum value of the electric field. When we use the frequency-dependent conductivity, we will use the notation

$$\sigma(\omega = 0) = \sigma_0, \tag{13.82}$$

for the zero-frequency, or “d.c.”, conductivity (in contrast to the frequency-dependent, or “a.c.”, conductivity). In Section 14.4 we will study in detail a simple model for $\sigma(\omega)$.

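Let us now consider the pair of Maxwell’s equations in matter involving the sources, eqs. (13.45) and (13.46). We perform the Fourier transform with respect to time only, writing for instance

$$\rho_{\rm free}(t, \mathbf{x}) = \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde{\rho}_{\rm free}(\omega, \mathbf{x}), \tag{13.83}$$

and similarly for all other quantities. Then, eqs. (13.45) and (13.46) become

$$\nabla\cdot\tilde{\mathbf{D}}(\omega, \mathbf{x}) = \tilde{\rho}_{\rm free}(\omega, \mathbf{x}), \tag{13.84}$$

$$\nabla\times\tilde{\mathbf{H}}(\omega, \mathbf{x}) + i\omega\, \tilde{\mathbf{D}}(\omega, \mathbf{x}) = \tilde{\mathbf{j}}_{\rm free}(\omega, \mathbf{x}), \tag{13.85}$$

while the continuity equation (13.25) becomes

$$-i\omega\, \tilde{\rho}_{\rm free}(\omega, \mathbf{x}) + \nabla\cdot\tilde{\mathbf{j}}_{\rm free}(\omega, \mathbf{x}) = 0. \tag{13.86}$$

Then, eq. (13.84) can be written as

$$\nabla\cdot\tilde{\mathbf{D}}(\omega, \mathbf{x}) = \frac{1}{i\omega}\, \nabla\cdot\tilde{\mathbf{j}}_{\rm free}(\omega, \mathbf{x}). \tag{13.87}$$

We now use the constitutive relations (13.79) and (13.81), and eqs. (13.87) and (13.85) become, respectively,

$$\epsilon(\omega)\, \nabla\cdot\tilde{\mathbf{E}}(\omega, \mathbf{x}) = 0, \tag{13.88}$$

$$\nabla\times\tilde{\mathbf{H}}(\omega, \mathbf{x}) + i\omega\, \epsilon(\omega)\, \tilde{\mathbf{E}}(\omega, \mathbf{x}) = 0, \tag{13.89}$$

where

$$\epsilon(\omega) = \epsilon_b(\omega) + i\, \frac{\sigma(\omega)}{\omega}. \tag{13.90}$$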


We see that the response function of a metal is not determined separately by the two functions $\epsilon_b(\omega)$ and $\sigma(\omega)$, but only by their combination (13.90) (apart from a function that relates $\mathbf{H}$ to $\mathbf{B}$, that will be introduced in Section 13.6.3, and that can be set to one in many typical situations). The function $\epsilon(\omega)/\epsilon_0$, with $\epsilon(\omega)$ defined by eq. (13.90), is called the dielectric function of a metal. The two terms in eq. (13.90) describe the separate contributions of the bound and free electrons. The fact that they combine into a single function is a reflection of the fact that (at least within a classical description) bound electrons oscillate around their equilibrium position with a set of oscillation frequencies $\omega_i$, and a free electron can just be seen as a bound electron with a frequency $\omega_0 = 0$, or anyhow $\omega_0$ much smaller than the other typical frequency scales in the problem, which can be simply added to the set of all other bound electrons. We will see this explicitly in Section 14.5, where we use an explicit model for $\sigma(\omega)$ and $\epsilon_b(\omega)$.
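As a minimal numerical sketch of eq. (13.90), the snippet below combines a bound-electron permittivity and a conductivity into the single response function $\epsilon(\omega)$. The function name `metal_permittivity`, the choice $\epsilon_b = \epsilon_0$, and the use of the d.c. conductivity of copper at a frequency of 1 GHz are illustrative assumptions, not values taken from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def metal_permittivity(omega, eps_bound, sigma):
    """Return eps(omega) = eps_b(omega) + i*sigma(omega)/omega, as in eq. (13.90).

    omega     : angular frequency in rad/s (must be non-zero)
    eps_bound : bound-electron permittivity eps_b(omega), in F/m (may be complex)
    sigma     : conductivity sigma(omega), in S/m (may be complex)
    """
    if omega == 0:
        raise ValueError("eps(omega) has a pole at omega = 0 when sigma(0) != 0")
    return eps_bound + 1j * sigma / omega


# Assumed illustrative inputs: eps_b = eps_0 and sigma ~ sigma_0 of copper at 1 GHz.
omega = 2 * math.pi * 1.0e9
eps_b = EPS0
sigma0 = 5.96e7  # S/m

eps = metal_permittivity(omega, eps_b, sigma0)
print("eps/eps0 =", eps / EPS0)

# i*omega*eps(omega) equals i*omega*eps_b - sigma: the displacement and conduction
# terms of eq. (13.85) merge into the single term appearing in eq. (13.89).
lhs = 1j * omega * eps
rhs = 1j * omega * eps_b - sigma0
print("relative mismatch:", abs(lhs - rhs) / abs(rhs))
```

With these inputs the imaginary part of $\epsilon(\omega)/\epsilon_0$ is many orders of magnitude larger than the real part, which illustrates why the conduction term dominates the low-frequency response of a good conductor.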


13.6.3 Diamagnetic and paramagnetic materials 


Similar to the case of dielectrics, in many materials the magnetization $\mathbf{M}$ is linearly proportional to the external magnetic field $\mathbf{B}$. Because of eq. (13.43), a linear relation between $\mathbf{M}$ and $\mathbf{B}$ implies a linear relation between $\mathbf{M}$ and $\mathbf{H}$. One then defines the magnetic susceptibility $\chi_m$ from

$$\mathbf{M} = \chi_m\, \mathbf{H}. \tag{13.91}$$

This is the simplest constitutive relation for magnetic matter. Equation (13.43) then implies a linear relation also between $\mathbf{B}$ and $\mathbf{H}$, which is written as

$$\mathbf{B} = \mu\, \mathbf{H}, \tag{13.92}$$

where

$$\mu = \mu_0 (1 + \chi_m). \tag{13.93}$$

The constant $\mu$ is called the (magnetic) permeability of the material. Just as in the case of the dielectric constant (or relative electric permittivity) $\epsilon_r = \epsilon/\epsilon_0$, one defines the relative magnetic permeability as $\mu/\mu_0$.¹⁴ Note that, for historical reasons, eqs. (13.91) and (13.92) are written as if $\mathbf{H}$ were the fundamental field and $\mathbf{B}$ the derived field, while we now understand that the opposite is true.

14 In some texts the relative permittivity $\epsilon/\epsilon_0$ is denoted by $\kappa_e$ and the relative permeability $\mu/\mu_0$ is denoted by $\kappa_m$. Note that some texts, such as Garg (2012), reserve the name “magnetic permeability” for $\mu/\mu_0$ rather than for $\mu$.

For a medium such that the constitutive relation is given by eq. (13.91), and in a static situation, so that $\partial\mathbf{D}/\partial t = 0$, combining eqs. (13.32) and (13.46) we find a relation between the magnetization current and the current of the free electrons,

$$\mathbf{j}_{\rm mag}(t, \mathbf{x}) = \chi_m\, \mathbf{j}_{\rm free}(t, \mathbf{x}), \tag{13.94}$$

to be compared to eq. (13.76) for the dielectric case. Contrary to the electric case, however, $\chi_m$ can be either positive or negative. When $\chi_m > 0$ the material is called paramagnetic, while when $\chi_m < 0$ it is called diamagnetic. Paramagnetism arises from the fact that an external magnetic field orients pre-existing magnetic dipole moments, due e.g. to the spin of the electrons. The pre-existing magnetic moments align with the external magnetic field, so $\chi_m > 0$. Diamagnetism is instead due to the magnetic dipoles induced by the external magnetic field. As we will show in Solved Problem 13.3, the magnetic moment induced in an atom by an external magnetic field is in fact in the direction opposite to the magnetic field.


13.7 Energy conservation 


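Performing on Maxwell’s equations in material bodies, eqs. (13.45)–(13.48), the same manipulations that we performed on the vacuum Maxwell’s equations in Section 3.2.2, we find

$$\int_V d^3x \left( \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t} + \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t} \right) + \int_V d^3x\, \mathbf{E}\cdot\mathbf{j}_{\rm free} = -\int_{\partial V} d^2\mathbf{s}\cdot(\mathbf{E}\times\mathbf{H}), \tag{13.95}$$

that reduces to eq. (3.35) in vacuum, where $\mathbf{D} = \epsilon_0\mathbf{E}$ and $\mathbf{H} = \mu_0^{-1}\mathbf{B}$. The term on the right-hand side gives the generalization of the Poynting vector to material media,

$$\mathbf{S} = \mathbf{E}\times\mathbf{H}. \tag{13.96}$$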


The term $\mathbf{E}\cdot\mathbf{j}_{\rm free}$ is still the work done on the system of charges by the electric field. Note, however, that the first term on the left-hand side of eq. (13.95) is no longer a total derivative, at least in general. This is not surprising, because energy balance in a medium must now include dissipative processes such as the production of heat. If however the medium is linear, i.e., $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$, and dispersion-less, so that $\epsilon$ and $\mu$ are independent of time, we can rewrite eq. (13.95) as

$$\frac{1}{2}\frac{d}{dt}\int_V d^3x \left( \epsilon E^2 + \mu^{-1} B^2 \right) + \int_V d^3x\, \mathbf{E}\cdot\mathbf{j}_{\rm free} = -\frac{1}{\mu}\int_{\partial V} d^2\mathbf{s}\cdot(\mathbf{E}\times\mathbf{B}), \tag{13.97}$$

and we can therefore identify the energy density of the electromagnetic field inside the material with

$$u = \frac{1}{2}\left( \epsilon E^2 + \mu^{-1} B^2 \right) = \frac{1}{2}\left( \epsilon E^2 + \mu H^2 \right), \tag{13.98}$$

while the Poynting vector (13.96) can also be written as

$$\mathbf{S} = \frac{1}{\mu}\, \mathbf{E}\times\mathbf{B}, \tag{13.99}$$

to be compared with the expression in vacuum, eq. (3.34).
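
The step from eq. (13.95) to eq. (13.97) can be made explicit. The short derivation below is a sketch under the assumptions stated in the text, namely that $\epsilon$ and $\mu$ are real and constant in time, and it uses only $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$:

```latex
\begin{align}
\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}
  &= \epsilon\,\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t}
   = \frac{1}{2}\frac{\partial}{\partial t}\left(\epsilon E^2\right), \\
\mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}
  &= \frac{1}{\mu}\,\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t}
   = \frac{1}{2}\frac{\partial}{\partial t}\left(\mu^{-1}B^2\right), \\
\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}
 +\mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}
  &= \frac{\partial u}{\partial t},
  \qquad u = \frac{1}{2}\left(\epsilon E^2+\mu^{-1}B^2\right).
\end{align}
```

If $\epsilon$ or $\mu$ depended on time (or, equivalently, on frequency), the first equality in each line would fail and the integrand would no longer be a total time derivative.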
13.8 Solved problems 


Problem 13.1. Electrostatics of dielectrics 


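We discuss here some aspects of the electrostatics of a simple dielectric, with the constitutive relation (13.72), $\mathbf{D}(t, \mathbf{x}) = \epsilon\mathbf{E}(t, \mathbf{x})$, with $\epsilon$ constant. In this case, inside the dielectric, eq. (13.49) becomes

$$\nabla\cdot\mathbf{E} = \frac{1}{\epsilon}\, \rho_{\rm free}, \tag{13.100}$$

$$\nabla\times\mathbf{E} = 0. \tag{13.101}$$

The situation is therefore formally identical to eqs. (4.1) and (4.2), with $\epsilon_0$ replaced by $\epsilon$. So, for instance, if we place a single point charge $q$ inside an infinite dielectric medium, instead of eq. (4.7) we will get

$$\mathbf{E}(\mathbf{x}) = \frac{1}{4\pi\epsilon}\, \frac{q}{r^2}\, \hat{\mathbf{r}}. \tag{13.102}$$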


Formally, this is the same as the field that would be generated, in vacuum, by a charge $q/\epsilon_r$. Since $\epsilon_r = \epsilon/\epsilon_0 > 1$, this means that the charge is screened, compared to the vacuum situation. Physically, what happens is that the presence of the charge $q$ partially orients the dipoles around it so that, on average, any sphere centered on $q$ contains, besides the charge $q$ itself, also an excess of charges of opposite sign, coming from the dipoles, which therefore partially screen it.

Similarly, for a parallel-plate capacitor filled with a dielectric, proceeding as in Solved Problem 4.3, we see that the electric field outside the plates vanishes. The electric field inside the capacitor, which in the vacuum case is given by eq. (4.153), is now given by¹⁵

$$\mathbf{E} = \frac{\sigma}{\epsilon}\, \hat{\mathbf{z}}, \tag{13.103}$$

where $\hat{\mathbf{z}}$ is the unit normal to the plates and $\sigma$ here denotes the surface charge density on the plates. Therefore, for a parallel-plate capacitor with plates of area $A$ at distance $d$, the capacitance $C$ becomes

$$C = \frac{\epsilon A}{d}, \tag{13.104}$$

so it is larger by a factor $\epsilon_r$ compared to the value in eq. (4.158). Filling a capacitor with a dielectric with large dielectric constant is therefore a way of increasing its capacitance.
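
To get a feeling for eq. (13.104), the sketch below evaluates the capacitance of a parallel-plate capacitor with and without a dielectric. The plate area, the plate separation and the function name `parallel_plate_capacitance` are illustrative assumptions; the value $\epsilon_r \simeq 80$ is the one quoted for water at 20 °C in Note 13.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def parallel_plate_capacitance(eps_r, area, distance):
    """C = eps_r * eps_0 * A / d for an ideal parallel-plate capacitor (SI units)."""
    return eps_r * EPS0 * area / distance


area = 1.0e-4      # assumed plate area: 1 cm^2
distance = 1.0e-3  # assumed plate separation: 1 mm

c_vacuum = parallel_plate_capacitance(1.0, area, distance)
c_water = parallel_plate_capacitance(80.0, area, distance)

print("C (vacuum) =", c_vacuum, "F")
print("C (eps_r = 80) =", c_water, "F")
print("enhancement factor =", c_water / c_vacuum)
```

The enhancement factor is just $\epsilon_r$, independently of the geometry chosen, as long as edge effects are neglected.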
Inside the dielectric, eq. (13.71), together with eq. (13.101) and the assumption that $\epsilon$ (and therefore $\chi_e$) is constant, implies

$$\nabla\times\mathbf{P} = 0. \tag{13.105}$$

15 For instance, one can repeat the computation performed in eqs. (4.139)–(4.141), summing over the contributions to the electric field of the individual surface elements of each plate. The computation is then formally the same, with $\epsilon_0$ replaced by $\epsilon$.

16 Explicitly, in components,

$$(\nabla\times\mathbf{P})_i = \epsilon_{ijk}(\partial_j\epsilon)E_k, \tag{13.107}$$

where we used eq. (13.101). Then, using eq. (13.106),

$$\partial_j\epsilon = \delta_{j3}\frac{d\epsilon(z)}{dz} = \hat{z}_j(\epsilon_2 - \epsilon_1)\delta(z).$$

From eq. (13.73), we have

$$(\epsilon_2 - \epsilon_1)\mathbf{E} = (\epsilon_0\chi_{e,2} - \epsilon_0\chi_{e,1})\mathbf{E} = \mathbf{P}_2 - \mathbf{P}_1,$$

where $\mathbf{P}_2$ and $\mathbf{P}_1$ are the polarization vectors of the two dielectrics. Then,

$$\epsilon_{ijk}(\partial_j\epsilon)E_k = (\epsilon_2 - \epsilon_1)\delta(z)\, \epsilon_{ijk}\hat{z}_jE_k = \delta(z)\, \epsilon_{ijk}\hat{z}_j(\mathbf{P}_2 - \mathbf{P}_1)_k.$$

Naively, one might then think that, in eq. (13.51), only the integral involving $\rho_{\rm free}$ contributes. Actually, this is no longer true when we consider a realistic situation in which the dielectric has a finite extent and therefore a boundary. Consider, for simplicity, the situation in which there is a flat boundary, identified with the plane $z = 0$, separating the region with $z < 0$, filled with a dielectric with permittivity $\epsilon_1$, from that with $z > 0$, which is filled with a dielectric with permittivity $\epsilon_2$. Even if $\epsilon_1$ and $\epsilon_2$ are constant, the permittivity changes abruptly at the interface. We can write

$$\epsilon(z) = \epsilon_1 + (\epsilon_2 - \epsilon_1)\theta(z), \tag{13.106}$$

where, as usual, $\theta(z)$ is the Heaviside theta function. Combining eqs. (13.71), (13.73), and (13.101), we then obtain¹⁶

$$\nabla\times\mathbf{P} = \delta(z)\, \hat{\mathbf{z}}\times(\mathbf{P}_2 - \mathbf{P}_1), \tag{13.108}$$


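and the integral in the second term in eq. (13.51) can be rewritten as

$$\int d^3x'\, \frac{(\nabla\times\mathbf{P})(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} = \int_S dx'dy'\, \frac{\hat{\mathbf{z}}\times(\mathbf{P}_2 - \mathbf{P}_1)(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|}, \tag{13.109}$$

where $S$ is the $(x', y')$ plane. For a generic curved boundary surface $S$, with surface element $d^2\mathbf{s}' = d^2s'\, \hat{\mathbf{n}}_s(\mathbf{x}')$, this generalizes to

$$\int d^3x'\, \frac{(\nabla\times\mathbf{P})(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} = \int_S d^2s'\, \frac{\hat{\mathbf{n}}_s(\mathbf{x}')\times(\mathbf{P}_2 - \mathbf{P}_1)(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|}. \tag{13.110}$$

In particular, if medium 1 is a dielectric with polarization $\mathbf{P}_1 = \mathbf{P}$ and medium 2 is the vacuum (so that $\hat{\mathbf{n}}_s$ is the outer normal to the surface of the dielectric), eq. (13.51) becomes

$$\mathbf{D}(\mathbf{x}) = -\nabla\left( \int d^3x'\, \frac{\rho_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right) - \nabla\times\left( \int_{\partial V} d^2s'\, \frac{\hat{\mathbf{n}}_s(\mathbf{x}')\times\mathbf{P}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right). \tag{13.111}$$

If we have an ensemble of dielectrics, labeled by an index $i$, embedded in free space, we must include the contribution from all their boundaries, so

$$\mathbf{D}(\mathbf{x}) = -\nabla\left( \int d^3x'\, \frac{\rho_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right) - \nabla\times\left( \sum_i \int_{\partial V_i} d^2s'\, \frac{\hat{\mathbf{n}}_{i,s}(\mathbf{x}')\times\mathbf{P}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right). \tag{13.112}$$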


Problem 13.2. Magnetostatics of simple magnetic matter 


We can discuss the magnetostatics of simple magnetic matter in a completely analogous manner, with the constitutive relation (13.92), $\mathbf{B} = \mu\mathbf{H}$, where $\mu$ is a constant. In this case, inside the material, eq. (13.53) becomes

$$\nabla\cdot\mathbf{B} = 0, \tag{13.113}$$

$$\nabla\times\mathbf{B} = \mu\, \mathbf{j}_{\rm free}. \tag{13.114}$$

The situation is therefore formally the same as eqs. (4.67) and (4.68), with $\mu_0$ replaced by $\mu$, and $\mathbf{j}$ by $\mathbf{j}_{\rm free}$. Equations (13.91), (13.92), and (13.93) imply that

$$\mathbf{M} = \frac{\chi_m}{1+\chi_m}\, \frac{1}{\mu_0}\, \mathbf{B} \simeq \chi_m\, \frac{1}{\mu_0}\, \mathbf{B}, \tag{13.115}$$

where we used $|\chi_m| \ll 1$. Inside the material, i.e., as long as we do not cross boundaries between different magnetic materials or between a magnetic material and the vacuum, $\mu$ (and therefore $\chi_m$) is constant. Then eqs. (13.113) and (13.115) imply that

$$\nabla\cdot\mathbf{M} = 0. \tag{13.116}$$


This is analogous to eq. (13.105) in the electrostatic case. Just as in the electrostatic case, this is no longer true at the boundaries between different materials, where $\chi_m$ changes discontinuously. Proceeding as in eq. (13.106), we can consider a planar interface at $z = 0$, where

$$\chi_m(z) = \chi_{m,1} + (\chi_{m,2} - \chi_{m,1})\theta(z). \tag{13.117}$$

Then, using eq. (13.115) together with $\nabla\cdot\mathbf{B} = 0$, we get¹⁷

$$\nabla\cdot\mathbf{M} = \delta(z)\, \hat{\mathbf{z}}\cdot(\mathbf{M}_2 - \mathbf{M}_1). \tag{13.118}$$


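Then, proceeding as in eqs. (13.108)–(13.112), we find that, for an ensemble of magnetic materials embedded in vacuum, eq. (13.55) becomes

$$\mathbf{H}(\mathbf{x}) = \nabla\times\left( \int d^3x'\, \frac{\mathbf{j}_{\rm free}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right) - \nabla\left( \sum_i \int_{\partial V_i} d^2s'\, \frac{\hat{\mathbf{n}}_{i,s}(\mathbf{x}')\cdot\mathbf{M}(\mathbf{x}')}{4\pi|\mathbf{x} - \mathbf{x}'|} \right). \tag{13.119}$$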


Problem 13.3. Diamagnetism 


In this Solved Problem we show that an external magnetic field induces, in a classical atom, a magnetic dipole in the opposite direction, leading to diamagnetism. To illustrate the effect, consider an electron with charge $-e$ [recall that $e > 0$ in our conventions, see eq. (2.4)] kept in a circular orbit of radius $r$ by the Coulomb interaction with a nucleus of charge $Ze$, in a purely classical description.¹⁸ We take the orbit to be in the $(x, y)$ plane, and the electron moving counterclockwise. In the absence of magnetic field, the electron rotates with frequency $\omega_0$ (taken positive by definition) given by

$$m_e\omega_0^2 r = \frac{1}{4\pi\epsilon_0}\, \frac{Ze^2}{r^2}, \tag{13.120}$$

i.e.,

$$\omega_0 = \left( \frac{1}{4\pi\epsilon_0}\, \frac{Ze^2}{m_e r^3} \right)^{1/2}. \tag{13.121}$$


Since the motion is counterclockwise, the angular momentum of the electron is in the positive $z$ direction, with $L_z = +m_e\omega_0 r^2$. Then, according to eq. (6.43) with $q_a = -e$, the magnetic moment of the electron due to its orbital motion is in the negative $z$ direction,

$$(m_z)_0 = -\frac{1}{2}\, e\, \omega_0 r^2, \tag{13.122}$$

where the subscript 0 in $(m_z)_0$ stresses that this is the result in the absence of magnetic field. We now add a magnetic field along the positive $z$ direction, $\mathbf{B} = B\hat{\mathbf{z}}$. Using the Lorentz force equation (3.5), the total force acting on the electron becomes

$$\mathbf{F} = -\frac{1}{4\pi\epsilon_0}\, \frac{Ze^2}{r^2}\, \hat{\mathbf{r}} - e\, \mathbf{v}\times\mathbf{B}. \tag{13.123}$$

For an electron that moves counterclockwise in the $(x, y)$ plane, and a magnetic field $\mathbf{B}$ along the positive $z$ axis, we have $\mathbf{v}\times\mathbf{B} = vB\, \hat{\mathbf{r}}$, so the Lorentz force has the same inward radial direction as the Coulomb force, and the rotation frequency $\omega$ is now determined by the positive solution of

$$m_e\omega^2 r = \frac{1}{4\pi\epsilon_0}\, \frac{Ze^2}{r^2} + e\, \omega r B. \tag{13.124}$$

This can be rewritten as

$$\omega^2 = \omega_0^2 + 2\omega\omega_L, \tag{13.125}$$

where $\omega_0$ is given by eq. (13.121) and

$$\omega_L = \frac{eB}{2m_e} \tag{13.126}$$

is called the Larmor frequency. Note that $\omega_L > 0$. In the limit $\omega_L \ll \omega_0$, i.e., for magnetic fields that do not exceed a (rather large) critical value, the solution of this equation for $\omega$ is

$$\omega \simeq \omega_0 + \omega_L. \tag{13.127}$$

17 Explicitly,

$$\nabla\cdot\mathbf{M} = \frac{d\chi_m}{dz}\, \frac{1}{\mu_0}\, \hat{\mathbf{z}}\cdot\mathbf{B} = \delta(z)(\chi_{m,2} - \chi_{m,1})\, \frac{1}{\mu_0}\, \hat{\mathbf{z}}\cdot\mathbf{B} = \delta(z)\, \hat{\mathbf{z}}\cdot(\mathbf{M}_2 - \mathbf{M}_1).$$

18 As we saw in the derivation of eq. (10.225), a purely classical description actually leads to the paradox that the electron would collapse onto the nucleus on a very short timescale, because of the emission of radiation. A first-principles computation should therefore be performed within quantum mechanics. However, the classical computation that we present here is already sufficient to understand the sign and the typical size of the perturbation induced by a weak magnetic field on an equilibrium state determined quantum-mechanically.


Since both $\omega_0$ and $\omega_L$ are positive, the effect of the magnetic field is to increase the rotation frequency, corresponding to the fact that, in our setting, the Lorentz force has the same inward radial direction as the Coulomb force. The resulting magnetic moment is therefore

$$m_z = -\frac{1}{2}\, e\, \omega r^2 \simeq (m_z)_0 - \frac{1}{2}\, e\, \omega_L r^2 = (m_z)_0 - \frac{e^2 r^2}{4m_e}\, B, \tag{13.128}$$

or, in vector form,

$$\mathbf{m} = \mathbf{m}_0 - \frac{e^2 r^2}{4m_e}\, \mathbf{B}. \tag{13.129}$$


Therefore, the induced magnetic moment is opposite to the magnetic field. Consider now a collection of atoms with random orientations, in an external magnetic field. The orbital magnetic moment $\mathbf{m}_0$ averages to zero, since the orientations of the atoms are random, while the term induced by the magnetic field has a non-vanishing average value, obtained replacing in eq. (13.129) $r^2$ by $\langle r_\perp^2\rangle$, where $r_\perp$ is the projection of the orbital radius on the plane orthogonal to the magnetic field. If the number of atoms per unit volume is $n$, the magnetization is therefore

$$\mathbf{M} = -\frac{n e^2\langle r_\perp^2\rangle}{4m_e}\, \mathbf{B}, \tag{13.130}$$

and therefore, using eqs. (13.91)–(13.93), the magnetic susceptibility is given by

$$\frac{\chi_m}{1+\chi_m} = -\frac{n e^2\langle r_\perp^2\rangle}{4m_e}\, \mu_0. \tag{13.131}$$

Numerically, $|\chi_m|$ is always a very small number, typically of order $10^{-5}$–$10^{-3}$ for paramagnets and $10^{-5}$ for diamagnetic materials. Then, eq. (13.131) can be written more simply as

$$\chi_m \simeq -\mu_0\, \frac{n e^2\langle r_\perp^2\rangle}{4m_e}. \tag{13.132}$$

Frequency-dependent 
response of materials 


As we have mentioned in Section 13.6, the response of a material, and 
therefore the corresponding constitutive relation, is in general frequency 
dependent, as in eqs. (13.77) or (13.81). In this chapter, we will first see 
how such frequency dependence is significantly constrained by causality 
and other general principles. We will then discuss explicit models for 
the frequency dependence of the response functions of dielectrics and of 
conductors. 


14.1 General properties of $\sigma(\omega)$, $\epsilon(\omega)$


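To illustrate general properties of the frequency dependence of the response functions, we begin with the relation $\tilde{\mathbf{j}}_{\rm free}(\omega) = \sigma(\omega)\tilde{\mathbf{E}}(\omega)$. First of all, from the fact that $\mathbf{j}_{\rm free}(t)$ and $\mathbf{E}(t)$ are real functions, it follows that their Fourier transforms satisfy $\tilde{\mathbf{j}}^*_{\rm free}(\omega) = \tilde{\mathbf{j}}_{\rm free}(-\omega)$ and $\tilde{\mathbf{E}}^*(\omega) = \tilde{\mathbf{E}}(-\omega)$, and therefore

$$\sigma^*(\omega) = \sigma(-\omega). \tag{14.1}$$

We separate $\sigma(\omega)$ into its real and imaginary parts,

$$\sigma(\omega) = \sigma_R(\omega) + i\sigma_I(\omega). \tag{14.2}$$

Then eq. (14.1) implies that (for real values of $\omega$)

$$\sigma_R(\omega) = \sigma_R(-\omega), \qquad \sigma_I(\omega) = -\sigma_I(-\omega). \tag{14.3}$$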


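The same relations hold for $\epsilon(\omega)$. Another property, this one specific to $\sigma(\omega)$, follows from the fact that, according to eq. (13.95), the work done per unit time on the free charges of the material by the electric field is

$$\frac{d\mathcal{E}}{dt} = \int d^3x\, \mathbf{E}\cdot\mathbf{j}_{\rm free}. \tag{14.4}$$

This energy is dissipated by the free electrons through collisions in the material, so the energy dissipated per unit volume and unit time is

$$\frac{d\mathcal{E}}{dV\, dt} = \mathbf{E}\cdot\mathbf{j}_{\rm free}, \tag{14.5}$$

and the energy per unit volume dissipated from $t = -\infty$ to $t = +\infty$ is¹

$$\frac{d\mathcal{E}}{dV} = \int_{-\infty}^{+\infty} dt\, \mathbf{E}(t)\cdot\mathbf{j}_{\rm free}(t) = \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \left[ \sigma_R(\omega) + i\sigma_I(\omega) \right] \tilde{\mathbf{E}}(\omega)\cdot\tilde{\mathbf{E}}(-\omega). \tag{14.6}$$

The term $\tilde{\mathbf{E}}(\omega)\cdot\tilde{\mathbf{E}}(-\omega)$ is an even function of $\omega$ and therefore, using eq. (14.3), the integral involving $\sigma_I(\omega)$ vanishes, while that over $\sigma_R$ is twice the integral from $\omega = 0$ to $\omega = \infty$. Using also $\tilde{\mathbf{E}}^*(\omega) = \tilde{\mathbf{E}}(-\omega)$, we finally get

$$\frac{d\mathcal{E}}{dV} = \int_0^{\infty} \frac{d\omega}{\pi}\, \sigma_R(\omega)\, |\tilde{\mathbf{E}}(\omega)|^2. \tag{14.7}$$

By the second law of thermodynamics, the dissipated heat must be positive, for an arbitrary function $\tilde{\mathbf{E}}(\omega)$. Therefore, for each Fourier mode, we must have $\sigma_R(\omega)|\tilde{\mathbf{E}}(\omega)|^2 > 0$, so

$$\sigma_R(\omega) > 0. \tag{14.8}$$

1 Explicitly:

$$\begin{aligned}
\int_{-\infty}^{+\infty} dt\, \mathbf{E}(t)\cdot\mathbf{j}_{\rm free}(t)
&= \int_{-\infty}^{+\infty} dt \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} \frac{d\omega'}{2\pi}\, \tilde{\mathbf{E}}(\omega')\cdot\tilde{\mathbf{j}}_{\rm free}(\omega)\, e^{-i(\omega+\omega')t} \\
&= \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} \frac{d\omega'}{2\pi}\, \sigma(\omega)\, \tilde{\mathbf{E}}(\omega)\cdot\tilde{\mathbf{E}}(\omega') \int_{-\infty}^{+\infty} dt\, e^{-i(\omega+\omega')t} \\
&= \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} \frac{d\omega'}{2\pi}\, \sigma(\omega)\, \tilde{\mathbf{E}}(\omega)\cdot\tilde{\mathbf{E}}(\omega')\, 2\pi\delta(\omega+\omega') \\
&= \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \sigma(\omega)\, \tilde{\mathbf{E}}(\omega)\cdot\tilde{\mathbf{E}}(-\omega).
\end{aligned}$$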


Another rather general property of $\sigma(\omega)$ is that it goes to zero as $\omega \to \infty$. This is a consequence of the fact that, if an electric field oscillates with a frequency much higher than the natural frequencies of the microscopic medium, the electrons cannot follow its oscillations, and no bulk movement, giving rise to a macroscopic current, is induced. In Section 14.4 we will check this behavior on a simple but explicit microscopic model for $\sigma(\omega)$. In fact, we will see that both $\sigma_R(\omega)$ and $\sigma_I(\omega)$ go to zero as $\omega \to \infty$, but $\sigma_R(\omega)$ goes to zero faster than $\sigma_I(\omega)$.
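
These general properties can be checked numerically on a toy response function. The sketch below assumes an exponentially decaying causal kernel $\sigma(t) = (\sigma_0/\tau)e^{-t/\tau}$ for $t \geq 0$ (an illustrative choice, not a model derived in the text), computes its Fourier transform with the convention $\sigma(\omega) = \int dt\, \sigma(t)e^{i\omega t}$ by a simple midpoint rule, and looks at eqs. (14.1), (14.3) and (14.8) and at the decay at large $\omega$. The function names are hypothetical.

```python
import cmath
import math


def sigma_t(t, sigma0=1.0, tau=1.0):
    """Assumed real kernel: (sigma0/tau) exp(-t/tau) for t >= 0, zero for t < 0."""
    return (sigma0 / tau) * math.exp(-t / tau) if t >= 0 else 0.0


def sigma_omega(omega, sigma0=1.0, tau=1.0, t_max=40.0, n=40000):
    """Midpoint-rule estimate of the integral of sigma(t) exp(i omega t) over 0 < t < t_max*tau."""
    dt = t_max * tau / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        total += sigma_t(t, sigma0, tau) * cmath.exp(1j * omega * t)
    return total * dt


s_plus = sigma_omega(2.0)
s_minus = sigma_omega(-2.0)
s_high = sigma_omega(50.0)

print("sigma(2)   =", s_plus)     # close to sigma0 / (1 - 2i) = 0.2 + 0.4i
print("sigma(-2)  =", s_minus)    # complex conjugate of sigma(2), eq. (14.1)
print("sigma(50)  =", s_high)     # much smaller in modulus; real part below imaginary part
```

For this kernel the exact transform is $\sigma_0/(1 - i\omega\tau)$, so the numerical values can be compared with a closed form; the real part is even and positive, the imaginary part is odd, and the real part decays faster at large $\omega$.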

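Analogous considerations can be made for $\epsilon(\omega)$. As in eq. (14.3), the reality of $\epsilon(t)$ implies that

$$\epsilon_R(\omega) = \epsilon_R(-\omega), \qquad \epsilon_I(\omega) = -\epsilon_I(-\omega). \tag{14.9}$$

It is convenient to introduce also $\chi_e(\omega)$ from

$$\frac{\epsilon(\omega)}{\epsilon_0} = 1 + \chi_e(\omega), \tag{14.10}$$

which generalizes eq. (13.73), so that eq. (13.77) reads

$$\tilde{\mathbf{D}}(\omega) = \epsilon_0\tilde{\mathbf{E}}(\omega) + \epsilon_0\chi_e(\omega)\tilde{\mathbf{E}}(\omega), \tag{14.11}$$

or, equivalently, from the Fourier transform of eq. (13.40),

$$\tilde{\mathbf{P}}(\omega) = \epsilon_0\chi_e(\omega)\tilde{\mathbf{E}}(\omega). \tag{14.12}$$

Writing $\chi_e(\omega) = \chi_{e,R}(\omega) + i\chi_{e,I}(\omega)$, from eq. (14.9), $\chi_{e,R}(\omega) = \chi_{e,R}(-\omega)$ and $\chi_{e,I}(\omega) = -\chi_{e,I}(-\omega)$. If the frequency of the external electric field is too high, it cannot induce a net polarization in the material, so $\chi_e(\omega)$ goes to zero as $\omega \to \infty$, and correspondingly $\epsilon(\omega)/\epsilon_0$ goes to one. We will check this behavior on an explicit microscopic model in Section 14.3.

14.2 Causality constraints and Kramers–Kronig relations


We next consider the constraints imposed by causality. Fourier transforming eq. (13.81) we get²

$$\mathbf{j}_{\rm free}(t) = \int_{-\infty}^{+\infty} dt'\, \sigma(t - t')\mathbf{E}(t'). \tag{14.13}$$

If $\sigma$ were a generic function of time, this equation would mean that the value of $\mathbf{j}_{\rm free}(t)$ at a given time $t$ is determined by the applied electric field $\mathbf{E}(t')$ at all possible values of $t'$, both in the past, $t' < t$, and in the future, $t' > t$. This clearly violates causality. We therefore discover that a relation such as eq. (13.81) only makes sense, physically, if $\sigma(t - t') = 0$ for $t' > t$, i.e., if the function $\sigma(t)$ vanishes when its argument $t$ is negative:

$$\sigma(t) = 0 \quad {\rm if}\ t < 0. \tag{14.14}$$

2 Explicitly:

$$\begin{aligned}
\mathbf{j}_{\rm free}(t)
&= \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \tilde{\mathbf{j}}_{\rm free}(\omega)\, e^{-i\omega t}
 = \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \sigma(\omega)\tilde{\mathbf{E}}(\omega)\, e^{-i\omega t} \\
&= \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi} \int_{-\infty}^{+\infty} dt''\, \sigma(t'')e^{i\omega t''} \int_{-\infty}^{+\infty} dt'\, \mathbf{E}(t')e^{i\omega t'}\, e^{-i\omega t} \\
&= \int_{-\infty}^{+\infty} dt'\, dt''\, \sigma(t'')\mathbf{E}(t')\, \delta(t' + t'' - t) \\
&= \int_{-\infty}^{+\infty} dt'\, \sigma(t - t')\mathbf{E}(t').
\end{aligned}$$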

Then, eq. (14.13) becomes

$$\mathbf{j}_{\rm free}(t) = \int_{-\infty}^{t} dt'\, \sigma(t - t')\mathbf{E}(t'), \tag{14.15}$$

consistent with causality. Equation (14.15) implies that the relation between $\mathbf{j}_{\rm free}(t)$ and $\mathbf{E}(t)$ is non-local in time: the value of $\mathbf{j}_{\rm free}$ at time $t$ depends not only on the value of $\mathbf{E}$ (and, possibly, of a finite number of its derivatives) at time $t$, but rather on the whole past history, i.e., on the whole behavior of $\mathbf{E}(t')$ at $t' < t$. On physical grounds, we expect that $\sigma(t - t')$ will go to zero sufficiently fast as $t' \to -\infty$, i.e., that $\sigma(t)$ goes to zero sufficiently fast when its argument $t \to +\infty$, so that the response of the system only retains a memory of the recent past. In the limit in which $\sigma(\omega)$ becomes independent of the frequency, $\sigma(t)$ becomes proportional to a Dirac delta, $\sigma(t - t') \propto \delta(t - t')$, and eq. (14.13) becomes local in time. Any frequency dependence in $\sigma(\omega)$ will, however, result in a relation between $\mathbf{j}_{\rm free}(t)$ and $\mathbf{E}(t)$ which is non-local in time.
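
Equation (14.15) can be illustrated with a discretized causal convolution. The sketch below assumes an exponentially decaying kernel $\sigma(t) = (\sigma_0/\tau)e^{-t/\tau}$ for $t \geq 0$ (an illustrative choice) and a field that is switched on abruptly; the names `kernel` and `causal_response`, the time step and the number of samples are hypothetical.

```python
import math


def kernel(t, sigma0, tau):
    """Assumed causal kernel sigma(t): zero for t < 0, exponential memory for t >= 0."""
    return (sigma0 / tau) * math.exp(-t / tau) if t >= 0 else 0.0


def causal_response(e_field, dt, sigma0=1.0, tau=1.0):
    """Rectangle-rule version of eq. (14.15): j[n] = sum over m <= n of sigma(t_n - t_m) E[m] dt."""
    j = []
    for n in range(len(e_field)):
        acc = 0.0
        for m in range(n + 1):          # only past and present samples contribute
            acc += kernel((n - m) * dt, sigma0, tau) * e_field[m]
        j.append(acc * dt)
    return j


dt = 0.01
# Field equal to zero for the first 200 samples, then equal to one (arbitrary units).
e_field = [0.0] * 200 + [1.0] * 800
j = causal_response(e_field, dt)

print("current before switch-on:", j[199])
print("current one tau after switch-on:", j[300])
print("current at the end:", j[-1])
```

The current is exactly zero before the field is switched on, and afterwards it relaxes toward $\sigma_0 E$ over a time of order $\tau$ (up to the discretization error of the rectangle rule), which is the memory effect described above.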

The condition (14.14) can be translated into a condition on $\sigma(\omega)$, as follows. First of all, we observe that

$$\sigma(\omega) = \int_{-\infty}^{+\infty} dt\, \sigma(t)e^{i\omega t} = \int_0^{+\infty} dt\, \sigma(t)e^{i\omega t}, \tag{14.16}$$

since the integrand vanishes for $t < 0$. This equation can be used to define $\sigma(\omega)$ for complex values of $\omega$. Writing $\omega = \omega_R + i\omega_I$,

$$\sigma(\omega) = \int_0^{+\infty} dt\, \sigma(t)\, e^{i\omega_R t}e^{-\omega_I t}. \tag{14.17}$$

3 We will check whether this is true on explicit examples, and we will then see how to amend the arguments below when $\sigma(\omega)$ or $\epsilon(\omega)$ have poles on the real axis; see also Note 6 on page 370.

Fig. 14.1 The integration contour in the complex $\omega'$ plane discussed in the text.

4 Found by Ralph Kronig in 1926, and Hans Kramers in 1927.

In the upper half-plane, $\omega_I > 0$, the integral is strongly convergent, since also $t > 0$. Therefore the integral is well defined. This reasoning does not hold at $\omega_I = 0$, and we then assume that $\sigma(\omega)$ has no singularity on the real axis either.³ In that case, we can conclude that the function $\sigma(\omega)$ is analytic in the upper half-plane.

This analyticity can be used to prove the Kramers–Kronig relations, as follows. We consider the integral

$$\oint_C d\omega'\, \frac{\sigma(\omega')}{\omega' - \omega}, \tag{14.18}$$

where $\omega$ is real and the contour $C$ is shown in Fig. 14.1. Since $\sigma(\omega')$ is analytic in the upper half-plane, and the contour avoids the singularity at $\omega' = \omega$, the integrand is analytic everywhere inside $C$ and therefore, by the Cauchy theorem, the integral vanishes. As we discussed above, for physical reasons $\sigma(\omega)$ must go to zero as $|\omega| \to \infty$ on the real axis, and this extends to the whole upper half-plane, thanks to the factor $e^{-\omega_I t}$ in eq. (14.17) (note that, in eq. (14.17), the integration domain is over $t > 0$). Therefore $\sigma(\omega')/(\omega' - \omega)$ goes to zero faster than $1/|\omega'|$ as $|\omega'| \to \infty$ in the upper half-plane, and therefore the contribution from the semi-circle at infinity vanishes. The integral on the real axis, with the small semi-circle excluded, is just the definition of the principal part of the integral, while the integral over the small semicircle, by the Cauchy theorem, is $-i\pi$ (where the minus is due to the fact that the semicircle is followed clockwise) times the residue of the function $\sigma(\omega')/(\omega' - \omega)$ at $\omega' = \omega$, which is just $\sigma(\omega)$. Then,

$$P\int_{-\infty}^{+\infty} d\omega'\, \frac{\sigma(\omega')}{\omega' - \omega} - i\pi\sigma(\omega) = 0, \tag{14.19}$$

where the symbol $P$ denotes the principal part. Separating into the real and imaginary parts, we get

$$\sigma_R(\omega) = \frac{1}{\pi}\, P\int_{-\infty}^{+\infty} d\omega'\, \frac{\sigma_I(\omega')}{\omega' - \omega}, \tag{14.20}$$

$$\sigma_I(\omega) = -\frac{1}{\pi}\, P\int_{-\infty}^{+\infty} d\omega'\, \frac{\sigma_R(\omega')}{\omega' - \omega}. \tag{14.21}$$

These relations (together with similar ones obtained for other functions such as $\epsilon(\omega)$, see later) are examples of Kramers–Kronig relations,⁴ or dispersion relations. Note that the knowledge of $\sigma_R(\omega)$ over the whole real axis fixes $\sigma_I(\omega)$, and vice versa. We can rewrite eq. (14.21) in the form

$$\begin{aligned}
\sigma_I(\omega) &= -\frac{1}{\pi}\, P\int_{-\infty}^{0} d\omega'\, \frac{\sigma_R(\omega')}{\omega' - \omega} - \frac{1}{\pi}\, P\int_{0}^{\infty} d\omega'\, \frac{\sigma_R(\omega')}{\omega' - \omega} \\
&= \frac{1}{\pi}\, P\int_{0}^{\infty} d\omega'\, \frac{\sigma_R(\omega')}{\omega' + \omega} - \frac{1}{\pi}\, P\int_{0}^{\infty} d\omega'\, \frac{\sigma_R(\omega')}{\omega' - \omega},
\end{aligned} \tag{14.22}$$




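where, in the second line, we have introduced the variable $\omega'' = -\omega'$ in the first integral, we have then used $\sigma_R(-\omega'') = \sigma_R(\omega'')$, and we finally renamed the integration variable $\omega''$ as $\omega'$. Performing the same manipulations on eq. (14.20) we see that the Kramers–Kronig relations can be rewritten as

$$\sigma_R(\omega) = \frac{2}{\pi}\, P\int_0^{\infty} d\omega'\, \frac{\omega'\sigma_I(\omega')}{\omega'^2 - \omega^2}, \tag{14.23}$$

$$\sigma_I(\omega) = -\frac{2\omega}{\pi}\, P\int_0^{\infty} d\omega'\, \frac{\sigma_R(\omega')}{\omega'^2 - \omega^2}. \tag{14.24}$$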


From these relations, we can try to extract the high-frequency limit $\omega \to \infty$ of $\sigma_R(\omega)$ and $\sigma_I(\omega)$. Actually, the point is somewhat delicate mathematically, since the integration region of $\omega'$ also ranges up to infinity. Consider first the limit $\omega \to \infty$ of eq. (14.24). Let us assume that $\sigma_R(\omega')$ goes to zero sufficiently fast as $\omega' \to \infty$ so that, for large $\omega$, the integral in eq. (14.24) gets most of its contribution from the integration region $\omega' \ll \omega$. In the limit $\omega \to \infty$ we can then try to replace $1/(\omega'^2 - \omega^2)$ by $-1/\omega^2$ inside the integral, which gives

$$\sigma_I(\omega) \simeq \frac{2}{\pi\omega}\int_0^{\infty} d\omega'\, \sigma_R(\omega'), \qquad (\omega \to \infty). \tag{14.25}$$

This way of extracting the large-frequency limit can be correct only if $\sigma_R(\omega)$ goes to zero faster than $1/\omega$ at large $\omega$, so that the integral in eq. (14.25) converges. As we will see, this is the case in the explicit microscopic models that we will study in Section 14.4, where $\sigma_R(\omega) \sim 1/\omega^2$ at large $\omega$. Therefore $\sigma_I(\omega)$ goes to zero as $1/\omega$, with a coefficient determined by the integral of $\sigma_R(\omega')$ over all frequencies. Relations of this form are called sum rules.⁵
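
Sum rules of this kind are easy to test numerically on a toy response function. The sketch below assumes $\sigma(\omega) = \sigma_0/(1 - i\omega\tau)$, i.e., the Fourier transform of an exponentially decaying causal kernel (an illustrative choice with arbitrary $\sigma_0 = 2$ and $\tau = 0.5$, not a model derived in the text), and checks the high-frequency sum rule (14.25) together with the static sum rule obtained by setting $\omega = 0$ in eq. (14.23). The helper `integrate_0_inf` is a hypothetical name for a midpoint rule applied after the substitution $\omega' = \tan\theta$.

```python
import math


def sigma(omega, sigma0=2.0, tau=0.5):
    """Assumed toy conductivity sigma0 / (1 - i omega tau)."""
    return sigma0 / (1 - 1j * omega * tau)


def integrate_0_inf(f, n=20000):
    """Midpoint rule for the integral of f(w) over 0 < w < infinity, using w = tan(theta)."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += f(math.tan(theta)) / math.cos(theta) ** 2   # dw = dtheta / cos^2(theta)
    return total * h


# High-frequency sum rule, eq. (14.25).
area_real = integrate_0_inf(lambda w: sigma(w).real)
omega_large = 1.0e4
predicted = 2.0 * area_real / (math.pi * omega_large)
actual = sigma(omega_large).imag
print("sigma_I at large omega:", actual, " sum-rule prediction:", predicted)

# Static sum rule: sigma_R(0) = (2/pi) * integral of sigma_I(w)/w over 0 < w < infinity.
static = (2.0 / math.pi) * integrate_0_inf(lambda w: sigma(w).imag / w)
print("sigma_R(0) from the sum rule:", static, " exact:", sigma(0.0).real)
```

For this toy function both integrals can also be done in closed form, giving $\sigma_0/(\omega\tau)$ for the large-$\omega$ behavior of $\sigma_I$ and $\sigma_0$ for the static conductivity, so the numerical values provide a consistency check rather than new information.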

Another useful sum rule is obtained setting $\omega = 0$ in eq. (14.23). This gives the static conductivity, i.e., $\sigma_R(\omega = 0)$, as an integral over all frequencies of the imaginary part,

$$\sigma_R(\omega = 0) = \frac{2}{\pi}\int_0^{\infty} d\omega'\, \frac{\sigma_I(\omega')}{\omega'}. \tag{14.26}$$

Observe that the property (14.3) implies that, as $\omega \to 0$, $\sigma_I(\omega)$ vanishes at least as $\omega$ (or, more generally, as an odd power of $\omega$), and therefore the integral in eq. (14.26) converges.

Similar considerations can be performed for $\epsilon(\omega)$. From eq. (14.11), going back into the time domain, we get

$$\mathbf{D}(t) = \epsilon_0\mathbf{E}(t) + \epsilon_0\int_{-\infty}^{+\infty} dt'\, \chi_e(t - t')\mathbf{E}(t'). \tag{14.27}$$

Therefore, causality implies again that $\chi_e(t) = 0$ for $t < 0$, so that eq. (14.27) becomes

$$\mathbf{D}(t) = \epsilon_0\mathbf{E}(t) + \epsilon_0\int_{-\infty}^{t} dt'\, \chi_e(t - t')\mathbf{E}(t'). \tag{14.28}$$

5 Note that we cannot extract the high-frequency limit of $\sigma_R(\omega)$ in the same way as we did in eq. (14.25). If we naively took the limit $\omega \to \infty$ by replacing $1/(\omega'^2 - \omega^2)$ with $-1/\omega^2$ inside the integral in eq. (14.23), we would get that, at large $\omega$, $\sigma_R(\omega)$ is equal to $-2/(\pi\omega^2)$ times $\int_0^{\infty} d\omega'\, \omega'\sigma_I(\omega')$. However, having already found from eq. (14.25) that $\sigma_I(\omega')$ goes as $1/\omega'$ at large $\omega'$, we see that this integral does not converge, and keeping the term $\omega'^2$ in $1/(\omega'^2 - \omega^2)$ is essential for the convergence. We will see, however, that in the explicit model of Section 14.4 $\sigma_R(\omega)$ still goes as $1/\omega^2$ at high frequencies.

6 This is not always the case for functions that describe the response of a material. For instance, for metals we see from eq. (13.90) that the permittivity function $\epsilon(\omega)$, and therefore also the corresponding expression for $\chi_e(\omega)$, has a pole at $\omega = 0$, since $\sigma(\omega)$ has a finite value at $\omega = 0$ (equal to the static conductivity $\sigma_0$). In general, if a function $f(\omega)$ has a simple pole at $\omega = \bar{\omega}$, of the form $f(\omega) \simeq f_0/(\omega - \bar{\omega})$ for $\omega \to \bar{\omega}$, one can write the dispersion relation for the function $g(\omega) = f(\omega) - f_0/(\omega - \bar{\omega})$, from which the pole has been subtracted, and derive the Kramers–Kronig relations for $g(\omega)$; this is equivalent to modifying the contour $C$ in Fig. 14.1 with another semi-arc in the upper half plane, so as to avoid also the pole of the function $f(\omega)$.


7 Observe that we could not have written the Kramers-Kronig relations
directly in terms of $\epsilon_r(\omega)$, because $\epsilon_r(\omega)$ goes to one, rather than
to zero, as $|\omega| \to \infty$, and therefore the integral on the semi-circle at
infinity, in the contour shown in Fig. 14.1, does not vanish. The
dispersion relation must be written in terms of a function that
vanishes at infinity, i.e., $\chi_e(\omega) = \epsilon_r(\omega) - 1$. When rewritten in terms
of $\epsilon_r(\omega)$ as in eqs. (14.31) and (14.32), this is called a subtracted
dispersion relation. Another example of subtracted dispersion relation
is given by the subtraction of a pole on the real axis, discussed in
Note 6.


8 This is particularly important for 
semiconductors, where the absence of 
conduction in the static limit is due to 
filled bands rather than to a localiza- 
tion of the electrons around individual 
atoms. 


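Similarly to what we found for $\sigma(\omega)$, the condition $\chi_e(t) = 0$ for
$t < 0$ implies that the Fourier transform $\chi_e(\omega)$ is analytic in the
upper half-plane (apart, possibly, from poles on the real axis, which for
the moment we assume are not present). We can then repeat the above
derivation and find the corresponding Kramers-Kronig relations,

$$\chi_{e,R}(\omega) = \frac{2}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\omega'\,\chi_{e,I}(\omega')}{\omega'^2-\omega^2}, \tag{14.29}$$

$$\chi_{e,I}(\omega) = -\frac{2\omega}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\chi_{e,R}(\omega')}{\omega'^2-\omega^2}. \tag{14.30}$$

In terms of the dielectric function $\epsilon_r(\omega) = 1 + \chi_e(\omega)$ these can be
rewritten as⁷

$$\epsilon_{r,R}(\omega) = 1 + \frac{2}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\omega'\,\epsilon_{r,I}(\omega')}{\omega'^2-\omega^2}, \tag{14.31}$$

$$\epsilon_{r,I}(\omega) = -\frac{2\omega}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\epsilon_{r,R}(\omega') - 1}{\omega'^2-\omega^2}, \tag{14.32}$$

where we wrote $\epsilon_r(\omega) = \epsilon_{r,R}(\omega) + i\epsilon_{r,I}(\omega)$. We will see in
Section 14.3 that the extraction of the high-frequency limit from these
formulas involves some subtlety.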


14.3 The Drude-Lorentz model for $\epsilon(\omega)$


We now consider a simple classical model for the permittivity $\epsilon(\omega)$ of
a dielectric, named after Drude and Lorentz. Consider a single electron
bound to an atom or a molecule, in the presence of an external
electric field $\mathbf{E}(t)$. The simplest classical description corresponds to a
damped harmonic oscillator, with frequency $\omega_0$ and damping constant
$\gamma_0$. Denoting by $\mathbf{x}_0(t)$ its position, the equation of motion is

$$\ddot{\mathbf{x}}_0 + \gamma_0\dot{\mathbf{x}}_0 + \omega_0^2\,\mathbf{x}_0 = -\frac{e\,\mathbf{E}(t)}{m_e}. \tag{14.33}$$

One should be aware that this classical description is a gross
simplification, and the underlying description of the behavior of the
electrons is necessarily quantum-mechanical.⁸ Still, this simple classical
model is useful for a first understanding. Performing the Fourier
transform of eq. (14.33) we get

$$\left(-\omega^2 - i\omega\gamma_0 + \omega_0^2\right)\mathbf{x}_0(\omega) = -\frac{e\,\mathbf{E}(\omega)}{m_e}. \tag{14.34}$$

The dipole moment of the bound electron is $\mathbf{d}(t) = -e\,\mathbf{x}_0(t)$, so
$\mathbf{d}(\omega) = -e\,\mathbf{x}_0(\omega)$ and, using eq. (14.34),

$$\mathbf{d}(\omega) = \frac{e^2}{m_e}\,\frac{\mathbf{E}(\omega)}{\omega_0^2 - \omega^2 - i\omega\gamma_0}. \tag{14.35}$$




Denoting by $n_b$ the number of bound electrons per unit volume, the
polarization $\mathbf{P}(t)$ (i.e., the electric dipole moment per unit volume) is
then given by

$$\mathbf{P}(\omega) = \frac{n_b e^2}{m_e}\,\frac{\mathbf{E}(\omega)}{\omega_0^2 - \omega^2 - i\omega\gamma_0}. \tag{14.36}$$

Then, from eq. (14.12),

$$\chi_e(\omega) = \frac{\omega_p^2}{\omega_0^2 - \omega^2 - i\omega\gamma_0}, \tag{14.37}$$

where we have defined

$$\omega_p^2 = \frac{n_b e^2}{\epsilon_0 m_e}, \tag{14.38}$$

and the dielectric function is given by

$$\epsilon_r(\omega) \equiv \frac{\epsilon(\omega)}{\epsilon_0} = 1 + \frac{\omega_p^2}{\omega_0^2 - \omega^2 - i\omega\gamma_0}. \tag{14.39}$$

Observe that $\omega_p$ has dimensions of a frequency. For reasons that will
become clear in Section 15.3.2, it is called the plasma frequency of the
material (even when, as in this case, the material is a dielectric rather
than a plasma).

A simple improvement of this model, which takes better into account, at
least at a heuristic level, the quantum description of the bound electrons,
is based on the fact that an electron in the ground state of an atom can
be excited to a discrete set of energy levels, corresponding to a set of
absorption frequencies $\omega_i$ and friction constants $\gamma_i$, with probability
$f_i$. We can then improve eq. (14.39), writing

$$\epsilon_r(\omega) = 1 + \omega_p^2\sum_{i=1}^{N}\frac{f_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i}, \tag{14.40}$$

where, for definiteness, we included $N$ oscillator levels (possibly, with
$N \to \infty$). The constants $f_i$ are called the "oscillator strengths" and, as
we will see below, the Kramers-Kronig relations enforce the condition

$$\sum_{i=1}^{N} f_i = 1. \tag{14.41}$$

In a classical description, one could then think of the $f_i$ as the fraction
of electrons in the $i$-th state, in which case one would expect
$0 < f_i < 1$. In a full quantum treatment of the atom, which here we have
rather modeled simply as a classical damped oscillator, eqs. (14.40) and
(14.41) still hold, but the interpretation of the oscillator strengths $f_i$
is different, and they no longer satisfy $0 < f_i < 1$.⁹


9 In a full quantum treatment, the oscillator strengths are related to the
transitions between quantum states of the atom. If the atom is in a
quantum state $n$, the oscillator strength $f_i$ should actually be written as
$f_{in}$, and is determined by the transition probability between the given
state $n$ and a generic atomic state $i \neq n$, multiplied by a factor that
contains the energy difference $E_i - E_n$. Therefore, "absorption
oscillator strengths," obtained from transitions to states $i$ with
$E_i > E_n$, are positive, while "emission oscillator strengths" are
negative. Equation (14.41) still holds and, in the quantum context, is
called the Thomas-Reiche-Kuhn sum rule, although the standard
normalization of the quantum oscillator strengths is such that it is
rather written as $\sum_{i\neq n} f_{in} = Z$, where $Z$ is the total number of
electrons in the atom. In the classical treatment, it is common to
reabsorb $Z$ in a rescaling of the $f_i$, as we have done. Note, however,
that now individual oscillator strengths can be negative, and as a
consequence the sum rule (14.41) no longer enforces $f_i < 1$, either. For
a discussion of the Drude-Lorentz model in a quantum context, see e.g.,
Dressel and Grüner (2002), in particular Section 6.1. For a discussion
of oscillator strengths in a quantum context and the
Thomas-Reiche-Kuhn sum rule, see also Chapter 10 of Rybicki and
Lightman (1979).

Fig. 14.2 The functions $\epsilon_R(\omega)/\epsilon_0$ (solid line) and $\epsilon_I(\omega)/\epsilon_0$
(dashed line) from eqs. (14.42) and (14.43), as a function of $\omega/\omega_p$. We
have set for definiteness $\omega_0 = 0.2\,\omega_p$ and $\gamma_0 = 0.1\,\omega_0 = 0.02\,\omega_p$.


10 Comparing with Fig. 13.4 on 
page 357 we see that the dielectric 
permittivity of water is not reproduced 
by the Drude—Lorentz model. This 
is due to the fact that the damped 
harmonic oscillator model that we 
have considered describes the dipole 
moment induced by an external electric 
field, see eq. (14.35). In contrast, water 
molecules have a permanent electric 
dipole, i.e., are polar molecules. The 
generation of a polarization vector P is 
then due to the fact that the molecules 
align themselves with the applied 
electric field, and is not described by 
eq. (14.33). 


For bound electrons, all $\omega_i$ are strictly positive, and this description
is appropriate for a dielectric. We will see later that, in a conductor, the
contribution of the free electrons can be described by the same
expression, with the addition of electrons in a state with $\omega_i = 0$.

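It is instructive to check that these results satisfy the general
properties discussed in Sections 14.1 and 14.2. We use for simplicity the
expression for $\epsilon(\omega)$ given in eq. (14.39), but this could be repeated for
eq. (14.40). Separating the real and imaginary parts, we get (for $\omega$ real)

$$\epsilon_{r,R}(\omega) \equiv \frac{\epsilon_R(\omega)}{\epsilon_0} = 1 + \omega_p^2\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma_0^2}, \tag{14.42}$$

$$\epsilon_{r,I}(\omega) \equiv \frac{\epsilon_I(\omega)}{\epsilon_0} = \omega_p^2\,\frac{\omega\gamma_0}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma_0^2}. \tag{14.43}$$

We see that $\epsilon_{r,R}(\omega) = \epsilon_{r,R}(-\omega)$ and $\epsilon_{r,I}(\omega) = -\epsilon_{r,I}(-\omega)$, as
required by the reality of $\epsilon_r(t)$. Note that $\epsilon_{r,R}(\omega) - 1$, and therefore
$\chi_{e,R}(\omega)$, can be either positive or negative. However, from eq. (14.39),
in the static limit

$$\epsilon_r = 1 + \frac{\omega_p^2}{\omega_0^2} > 1, \tag{14.44}$$

where $\epsilon_r \equiv \epsilon(\omega = 0)/\epsilon_0$. A plot of $\epsilon_{r,R}(\omega)$ and $\epsilon_{r,I}(\omega)$ is shown in
Fig. 14.2.¹⁰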
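As a quick numerical illustration of eqs. (14.39) and (14.44), the following minimal Python sketch evaluates the single-oscillator dielectric function for the parameter choice of Fig. 14.2 and checks the symmetry properties and the static limit. The function name `eps_r_drude_lorentz` and the choice of units in which $\omega_p = 1$ are assumptions made here for illustration only.

```python
def eps_r_drude_lorentz(omega, omega_p, omega_0, gamma_0):
    """Relative permittivity eps_r(omega) of eq. (14.39), for real omega."""
    return 1.0 + omega_p**2 / (omega_0**2 - omega**2 - 1j * omega * gamma_0)


# Parameters of Fig. 14.2, in units where omega_p = 1 (an illustrative choice).
omega_p = 1.0
omega_0 = 0.2 * omega_p
gamma_0 = 0.1 * omega_0

# Static limit, eq. (14.44): eps_r(0) = 1 + omega_p^2 / omega_0^2.
eps_static = eps_r_drude_lorentz(0.0, omega_p, omega_0, gamma_0)

# Reality of the response in the time domain: eps_r(-omega) = conj(eps_r(omega)),
# i.e. the real part is even and the imaginary part is odd in omega.
eps_plus = eps_r_drude_lorentz(0.5, omega_p, omega_0, gamma_0)
eps_minus = eps_r_drude_lorentz(-0.5, omega_p, omega_0, gamma_0)

print("eps_r(0)     =", eps_static.real)
print("eps_r(+0.5)  =", eps_plus)
print("eps_r(-0.5)  =", eps_minus)
```

For $\omega > \omega_0$ the real part of $\epsilon_r(\omega) - 1$ comes out negative, while the imaginary part stays positive, in line with the remarks above.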
The poles of $\chi_e(\omega)$ in the complex plane are given by the solution of
the equation $\omega^2 + i\omega\gamma_0 - \omega_0^2 = 0$, i.e., by

$$\omega_\pm = -\frac{i\gamma_0}{2} \pm \sqrt{\omega_0^2 - \frac{\gamma_0^2}{4}}. \tag{14.45}$$

If $\omega_0 > \gamma_0/2$, the square root is real. In any case, there are two poles,
both in the lower half-plane (even when $\omega_0 < \gamma_0/2$). Therefore $\chi_e(\omega)$
is analytic in the upper half-plane. As discussed in Section 14.2, these
analyticity properties in the complex $\omega$-plane are related to causality:
consider the electric susceptibility in the time domain,

$$\chi_e(t) = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,\chi_e(\omega)\,e^{-i\omega t}. \tag{14.46}$$

Note that the integral converges at $\omega \to \pm\infty$, since at large $|\omega|$ we have
$\chi_e(\omega) \propto 1/\omega^2$, see eq. (14.37), and $\chi_e(\omega)$ has no singularity on the real
axis, so the integral is well defined. For $t < 0$ we can close the contour in
the upper half-plane, since the factor
$e^{-i(\omega_R + i\omega_I)t} = e^{-i\omega_R t}e^{\omega_I t}$ ensures the vanishing of the integral on the
semi-circle at infinity when $\omega_I \to +\infty$ and $t < 0$. Then, since $\chi_e(\omega)$ is
analytic in the upper half-plane, $\chi_e(t) = 0$ for $t < 0$, as required by
causality. For $t > 0$, closing the contour in the lower half plane and
picking the residue of the two poles, we find that (for $\omega_0 > \gamma_0/2$)
$\chi_e(t)$ is a (real) superposition of terms proportional to $e^{-i\omega_+ t}$ and
$e^{-i\omega_- t}$, i.e., is proportional to

$$\chi_e(t) \propto e^{\pm i\sqrt{\omega_0^2 - (\gamma_0/2)^2}\,t}\; e^{-(\gamma_0/2)t}. \tag{14.47}$$

From eq. (14.28), the memory of the past behavior of $\mathbf{E}(t)$ therefore
vanishes exponentially, with a decay time $\tau = 2/\gamma_0$. More generally, in
the model (14.40), the $i$-th term in the sum contributes with a decay
time $\tau_i = 2/\gamma_i$.
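The location of the poles can be checked numerically. The sketch below, with hypothetical helper names `chi_poles` and `decay_time` and illustrative parameter values, solves $\omega^2 + i\omega\gamma_0 - \omega_0^2 = 0$ with the quadratic formula of eq. (14.45) and confirms that both roots lie in the lower half-plane, in the underdamped as well as in the overdamped case.

```python
import cmath


def chi_poles(omega_0, gamma_0):
    """Roots of omega^2 + i*omega*gamma_0 - omega_0^2 = 0, as in eq. (14.45)."""
    root = cmath.sqrt(omega_0**2 - gamma_0**2 / 4.0)
    return (-0.5j * gamma_0 + root, -0.5j * gamma_0 - root)


def decay_time(gamma_0):
    """Decay time tau = 2/gamma_0 of the memory kernel (underdamped case)."""
    return 2.0 / gamma_0


# One underdamped and one overdamped example (illustrative values).
underdamped = chi_poles(omega_0=1.0, gamma_0=0.2)
overdamped = chi_poles(omega_0=0.1, gamma_0=1.0)

for label, poles in (("underdamped", underdamped), ("overdamped", overdamped)):
    for p in poles:
        print(label, p, "lower half-plane:", p.imag < 0)

print("decay time for gamma_0 = 0.2:", decay_time(0.2))
```

In the overdamped case the square root becomes purely imaginary, but its modulus is smaller than $\gamma_0/2$, so neither pole can cross the real axis.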
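From the explicit result (14.42), (14.43) we can extract the
high-frequency limit of the dielectric function,

$$\epsilon_{r,R}(\omega) - 1 \simeq -\frac{\omega_p^2}{\omega^2}, \tag{14.48}$$

$$\epsilon_{r,I}(\omega) \simeq \frac{\omega_p^2\gamma_0}{\omega^3}, \tag{14.49}$$

so, to lowest non-trivial order, $\epsilon_r(\omega) - 1$ is real, and

$$\epsilon_r(\omega) = 1 - \frac{\omega_p^2}{\omega^2} + O\!\left(\frac{1}{\omega^3}\right). \tag{14.50}$$

11 The proof is non-trivial. Using eq. (14.42), the integral in eq. (14.51)
is proportional to

$$I(u_0) = \int_0^\infty du\, \frac{1-u^2}{(1-u^2)^2 + u_0^2u^2},$$

where $u = \omega'/\omega_0$ and $u_0 = \gamma_0/\omega_0$.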


It is interesting to compare these asymptotic behaviors to those that can
be extracted from the Kramers-Kronig relations (14.31, 14.32). In the
large $\omega$ limit, eq. (14.32) gives

$$\epsilon_{r,I}(\omega) = \frac{2}{\pi\omega}\int_0^\infty d\omega'\, \left[\epsilon_{r,R}(\omega') - 1\right] + O\!\left(\frac{1}{\omega^3}\right), \qquad (\omega \to \infty), \tag{14.51}$$

where we assumed that $\epsilon_{r,R}(\omega') - 1$ goes to zero sufficiently fast at
large $\omega'$, so that the integral in eq. (14.51) converges, and we could then
replace $1/(\omega'^2 - \omega^2)$ by $-1/\omega^2$ inside the integral. Apparently, this seems
to indicate that $\epsilon_{r,I}(\omega)$ goes as $1/\omega$ at large $\omega$ while, from the explicit
result (14.49), we know that it rather goes as $1/\omega^3$. The solution of
this apparent discrepancy is that, for the function (14.42), the integral
in eq. (14.51) not only converges, but in fact even vanishes, leaving the
$O(1/\omega^3)$ term as the leading one.¹¹ So, also from the Kramers-Kronig
relations we find that, for large $\omega$, the leading term in $\epsilon_{r,I}(\omega)$ is of
order $1/\omega^3$. Then, eq. (14.31) shows that

$$\epsilon_{r,R}(\omega) - 1 \simeq -\frac{2}{\pi\omega^2}\int_0^\infty d\omega'\, \omega'\,\epsilon_{r,I}(\omega'), \qquad (\omega \to \infty), \tag{14.52}$$

where, thanks to the $1/\omega'^3$ asymptotic behavior of $\epsilon_{r,I}(\omega')$, the integral
converges at large $\omega'$, so, in eq. (14.31), in the large $\omega$ limit we could
replace $1/(\omega'^2 - \omega^2)$ by $-1/\omega^2$. This result is in agreement with the
explicit result (14.48). Inserting the asymptotic behavior (14.48) into the
left-hand side of eq. (14.52) we see that, independently of the damping
details, and therefore of the precise form of the function $\epsilon_{r,I}(\omega)$, we
have the sum rule

$$\int_0^\infty d\omega\, \omega\,\epsilon_{r,I}(\omega) = \frac{\pi}{2}\,\omega_p^2. \tag{14.53}$$

This is known as the f-sum rule. If we apply it to the more complicated
model (14.40), the f-sum rule enforces the condition $\sum_i f_i = 1$.¹²
Observe that, for eq. (14.39), the Kramers-Kronig relations are
automatically satisfied, as we saw above. This follows from the fact that


At first sight, it is not at all obvious that this integral vanishes, and
moreover for all values of $u_0$. Actually, a direct computation of the
primitive of the integrand is quite involved, and it is much simpler to
show that the contribution to the integral from the integration region
$0 < u < 1$, where the integrand is positive, cancels against the
contribution from $1 < u < \infty$, where it is negative. To this purpose, we
write $I(u_0) = I_1(u_0) + I_2(u_0)$, where

$$I_1(u_0) = \int_0^1 du\, \frac{1-u^2}{(1-u^2)^2 + u_0^2u^2}, \qquad
I_2(u_0) = \int_1^\infty du\, \frac{1-u^2}{(1-u^2)^2 + u_0^2u^2}.$$

In $I_1$ we make the transformation of variables $u = \tanh\eta$, which gives

$$I_1(u_0) = \int_0^\infty d\eta\, \frac{1}{1 + u_0^2\sinh^2\eta\,\cosh^2\eta}.$$

In $I_2$ we write $u = 1/\tanh\eta$, and we get $I_2 = -I_1$, so, indeed,
$I_1 + I_2 = 0$ for all $u_0$.


12 Again, the proof is not completely obvious. Inserting eq. (14.43) into
eq. (14.53) we get

$$\frac{\pi}{2} = \sum_i f_i \int_0^\infty du\, \frac{u^2}{(u_i^2 - u^2)^2 + u^2},$$

where $u = \omega/\gamma_i$ and $u_i = \omega_i/\gamma_i$. Despite its appearance, the integral is
actually independent of $u_i$, and equal to its value at $u_i = 0$, which is
$\pi/2$. This can be shown (with some manipulations) using formula 2.161.3
of Gradshteyn and Ryzhik (1980), and can also be directly checked
numerically.

13 More precisely, after performing the Fourier transform as in
eq. (14.57), we assume that, at the frequency $\omega$ of interest, the Fourier
mode $\mathbf{E}(\omega)$ can be taken to be spatially constant. We will come back to
this assumption at the end of Section 15.3.1, when we discuss the
anomalous skin effect.


eq. (14.39) has been obtained from an explicit causal model, expressed
by eq. (14.33), and therefore the constraints due to causality are
automatically satisfied. In contrast, in our classical context, eq. (14.40)
is simply an ansatz, not derived from an explicit causal equation of
motion, and therefore the Kramers-Kronig relations provide a non-trivial
constraint.
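The two integrals discussed in Notes 11 and 12 can also be checked by brute force. The following sketch is only an illustration: it uses a plain midpoint rule on a truncated interval, and corrects for the truncation with the leading $1/u^2$ behavior of each integrand at large $u$. The function names, the cutoff `u_max` and the number of cells `n` are arbitrary choices made here, not part of the derivation.

```python
import math


def midpoint_integral(f, a, b, n):
    """Composite midpoint rule for the integral of f over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def note11_integral(u0, u_max=200.0, n=200_000):
    """I(u0) of Note 11; the tail beyond u_max is approximated by -1/u_max."""
    f = lambda u: (1.0 - u * u) / ((1.0 - u * u) ** 2 + (u0 * u) ** 2)
    return midpoint_integral(f, 0.0, u_max, n) - 1.0 / u_max


def note12_integral(ui, u_max=200.0, n=200_000):
    """Integral of Note 12; the tail beyond u_max is approximated by +1/u_max."""
    f = lambda u: u * u / ((ui * ui - u * u) ** 2 + u * u)
    return midpoint_integral(f, 0.0, u_max, n) + 1.0 / u_max


i11 = [note11_integral(u0) for u0 in (0.5, 2.0)]
i12 = [note12_integral(ui) for ui in (0.0, 1.0, 3.0)]
print("Note 11 integral for u0 = 0.5, 2.0:", i11)          # close to 0
print("Note 12 integral for ui = 0, 1, 3: ", i12)          # close to pi/2
print("pi/2 =", math.pi / 2)
```

Within the accuracy of this crude quadrature, the first integral vanishes for each $u_0$ tried, and the second is independent of $u_i$ and equal to $\pi/2$.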

Some aspects of these results are more general than the specific
Drude-Lorentz model that we have discussed. In particular, the fact that,
to lowest order, the asymptotic behavior (14.50) is independent of $\gamma_0$
follows from the fact that, at sufficiently large frequencies, the details of
the damping mechanism, represented by the term $-i\omega\gamma_0$ in eq. (14.34),
become irrelevant, since in any case the inner dynamics of the atom or
molecule cannot follow the fast oscillations of the external field.


14.4 The Drude model of conductivity 


We next consider a simple microscopic model of the conductivity $\sigma(\omega)$,
the Drude model (or Drude-Sommerfeld model). Again, the model is
purely classical, and treats the free electrons of a conductor as a
non-interacting gas of classical electrons. A more realistic treatment would
involve quantum mechanics and considerations of the band structure of
the solid, and rather belongs to a condensed matter course. However,
just as for the Drude-Lorentz model of $\epsilon(\omega)$, the simple classical
description that we present here is already useful for a first understanding.
The basic physics is quite simple. The free electrons inside a material
are accelerated by the external electric field. The collisions with the ions
of the material, however, provide a friction term that opposes the
acceleration. In an average sense, the equation of motion of an electron is
then given by eq. (14.33), where now $\omega_0 = 0$, since there is no restoring
force. The friction constant $\gamma_0$ is written as $1/\tau$, and $\tau$ is identified with
the typical time between collisions. Then, the average velocity $\mathbf{v}(t)$ of
the electrons is governed by the equation

$$\frac{d\mathbf{v}(t)}{dt} + \frac{\mathbf{v}(t)}{\tau} = -\frac{e}{m_e}\,\mathbf{E}(t). \tag{14.54}$$

We write

$$\mathbf{j}_{\rm free}(t) = -e\,n_f\,\mathbf{v}(t), \tag{14.55}$$

where $n_f$ is the number density of free electrons (with charge $-e$). From
eq. (14.54) it then follows that

$$\frac{d\mathbf{j}_{\rm free}}{dt} + \frac{1}{\tau}\,\mathbf{j}_{\rm free}(t) = \frac{n_f e^2}{m_e}\,\mathbf{E}(t). \tag{14.56}$$

We have also assumed that the electric field is spatially constant over
the relevant length-scale, which in this case is the mean free path of the
electrons between collisions. Performing the Fourier transform, this
gives

$$-i\omega\,\mathbf{j}_{\rm free}(\omega) + \frac{1}{\tau}\,\mathbf{j}_{\rm free}(\omega) = \frac{n_f e^2}{m_e}\,\mathbf{E}(\omega), \tag{14.57}$$

so that, writing $\mathbf{j}_{\rm free}(\omega) = \sigma(\omega)\mathbf{E}(\omega)$, we get

$$\sigma(\omega) = \frac{n_f e^2\tau}{m_e}\,\frac{1}{1 - i\omega\tau}. \tag{14.58}$$

Defining, as in eq. (13.82), $\sigma_0 \equiv \sigma(\omega = 0)$, eq. (14.58) can also be
rewritten as

$$\sigma(\omega) = \frac{\sigma_0}{1 - i\omega\tau}, \tag{14.59}$$

where

$$\sigma_0 = \frac{n_f e^2\tau}{m_e}. \tag{14.60}$$

Separating the real and imaginary parts (for $\omega$ real), we get

$$\sigma_R(\omega) = \sigma_0\,\frac{1}{1 + \omega^2\tau^2}, \tag{14.61}$$

$$\sigma_I(\omega) = \sigma_0\,\frac{\omega\tau}{1 + \omega^2\tau^2}. \tag{14.62}$$


We can again check that the general properties discussed in Section 14.1
are satisfied. The reality condition (14.3) is obviously satisfied, and
also the condition (14.8), which is a consequence of the second law of
thermodynamics. Let us now check the Kramers-Kronig relations. We
start from eq. (14.24), which we rewrite in the form [compare with the
third line in eq. (14.22)]

$$\sigma_I(\omega) = -\frac{2\omega}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\sigma_R(\omega')}{\omega'^2 - \omega^2}
= -\frac{\sigma_0}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{1}{1 + \omega'^2\tau^2}\left(\frac{1}{\omega' - \omega} - \frac{1}{\omega' + \omega}\right). \tag{14.63}$$


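We take for definiteness $\omega > 0$. Then, the second term in the parenthesis
has no pole, so there the principal value prescription is unnecessary.
Introducing $u = \omega'\tau$ and $u_0 = \omega\tau$, we must then compute

$$\begin{aligned}
P\!\int_0^\infty d\omega'\, \frac{1}{1+\omega'^2\tau^2}\left(\frac{1}{\omega'-\omega} - \frac{1}{\omega'+\omega}\right)
&= P\!\int_0^\infty \frac{du}{1+u^2}\,\frac{1}{u-u_0} - \int_0^\infty \frac{du}{1+u^2}\,\frac{1}{u+u_0} \\
&= \lim_{\epsilon\to 0^+}\left(\int_0^{u_0-\epsilon} \frac{du}{1+u^2}\,\frac{1}{u-u_0} + \int_{u_0+\epsilon}^\infty \frac{du}{1+u^2}\,\frac{1}{u-u_0}\right) \\
&\quad - \int_0^\infty \frac{du}{1+u^2}\,\frac{1}{u+u_0},
\end{aligned} \tag{14.64}$$

where, in the last equality, we have used the definition of principal value.
These integrals can be computed analytically,¹⁴ and the result is

$$\frac{\sigma_0}{\pi}\, P\!\int_0^\infty d\omega'\,\frac{1}{1+\omega'^2\tau^2}\left(\frac{1}{\omega'-\omega}-\frac{1}{\omega'+\omega}\right) = -\sigma_0\,\frac{\omega\tau}{1+\omega^2\tau^2}. \tag{14.65}$$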


14 The explicit computation is as follows. For the first two integrals,
taking into account that $u_0 > 0$, $\epsilon > 0$ and $\epsilon < u_0$ (since we want to
eventually take the limit $\epsilon \to 0^+$ with $u_0$ a fixed positive constant), we
have

$$\int_0^{u_0-\epsilon} \frac{du}{1+u^2}\,\frac{1}{u-u_0} = -\frac{f_1(u_0,\epsilon)}{2(1+u_0^2)},$$

and

$$\int_{u_0+\epsilon}^{\infty} \frac{du}{1+u^2}\,\frac{1}{u-u_0} = \frac{f_2(u_0,\epsilon)}{2(1+u_0^2)},$$

where

$$f_1(u_0,\epsilon) = \log\left[1 + (u_0-\epsilon)^2\right] + 2u_0\tan^{-1}(u_0-\epsilon) - 2\log\epsilon + 2\log u_0,$$

$$f_2(u_0,\epsilon) = \log\left[1 + (u_0+\epsilon)^2\right] + 2u_0\tan^{-1}(u_0+\epsilon) - 2\log\epsilon - \pi u_0.$$

As expected, as $\epsilon \to 0^+$, the two integrals are separately divergent, as
$\log\epsilon$, corresponding to the fact that, as $\epsilon \to 0^+$, the integration region
approaches a pole singularity $1/(u - u_0)$. However, their sum has a finite
limit, equal to

$$-\frac{2\log u_0 + \pi u_0}{2(1+u_0^2)}.$$

The third integral in eq. (14.64) is also elementary and requires no
regularization,

$$\int_0^\infty \frac{du}{1+u^2}\,\frac{1}{u+u_0} = \frac{-2\log u_0 + \pi u_0}{2(1+u_0^2)}.$$

Then, putting everything together, we get eq. (14.65).


Fig. 14.3 The integration contour in the complex $\omega$ plane for $t > 0$. The
black dot denotes the position of the pole.


Inserting this into eq. (14.63) we indeed obtain a result for $\sigma_I(\omega)$ that
agrees with that given in eq. (14.62), confirming that the
Kramers-Kronig relation (14.24) is satisfied, as it should be. We can
similarly check eq. (14.23).
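The principal-value integral in eq. (14.64) can also be evaluated numerically, as an independent check of eq. (14.65). The sketch below is a rough illustration rather than a robust quadrature: it uses a midpoint rule on a uniform grid chosen so that the sample points fall symmetrically around the pole at $u = u_0$, in which case the divergent contributions on the two sides cancel pairwise. The function names, the step `h` and the cutoff `u_max` are assumptions made here.

```python
import math


def pv_drude_integral(u0, h=1e-3, u_max=100.0):
    """Principal value of int_0^inf du/(1+u^2) [1/(u-u0) - 1/(u+u0)].

    Midpoint rule on a uniform grid; u0 must be an integer multiple of h, so
    that the sample points are placed symmetrically around the pole at u = u0
    and the divergent contributions cancel pairwise. The integrand decays as
    1/u^4, so the region u > u_max is simply dropped.
    """
    n = round(u_max / h)
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += (1.0 / (u - u0) - 1.0 / (u + u0)) / (1.0 + u * u)
    return total * h


def sigma_i_from_kk(u0, sigma_0=1.0):
    """sigma_I from eq. (14.63), with u0 = omega * tau."""
    return -sigma_0 / math.pi * pv_drude_integral(u0)


def sigma_i_drude(u0, sigma_0=1.0):
    """sigma_I from the explicit Drude result, eq. (14.62)."""
    return sigma_0 * u0 / (1.0 + u0 * u0)


for u0 in (1.0, 2.0):
    print(u0, sigma_i_from_kk(u0), sigma_i_drude(u0))
```

The two columns printed for each $u_0$ should agree to within the accuracy of the quadrature.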

Several other interesting properties can be checked from the explicit
expression (14.58). In particular we see that, in the complex
$\omega$ plane, the function $\sigma(\omega)$ has a single pole at $\omega = -i/\tau$. This pole
is in the lower half-plane and therefore $\sigma(\omega)$ is analytic in the upper
half-plane, as we expected from the general discussion in Section 14.2.
Let us then compute the function $\sigma(t)$ that enters in eq. (14.13). We
write

$$\sigma(t) = \int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,\sigma(\omega)\,e^{-i\omega t}
= \sigma_0\int_{-\infty}^{+\infty}\frac{d\omega}{2\pi}\,\frac{1}{1 - i\omega\tau}\,e^{-i\omega t}. \tag{14.66}$$

Writing $\omega = \omega_R + i\omega_I$, we have $e^{-i\omega t} = e^{-i\omega_R t}e^{\omega_I t}$. Therefore, for
$t < 0$, we can close the contour in the upper half-plane $\omega_I > 0$, where
there is no singularity, and the integral vanishes. Therefore $\sigma(t) = 0$ for
$t < 0$, in agreement with eq. (14.14). For $t > 0$ we close instead the
contour in the lower half-plane. Picking the residue at the pole according
to the Cauchy theorem (and taking into account a minus sign because the
integration contour runs clockwise, see Fig. 14.3), and writing $\sigma_0$
explicitly as in eq. (14.60),

$$\begin{aligned}
\sigma(t) &= \frac{n_f e^2\tau}{m_e}\,(-2\pi i)\,\mathrm{Res}_{\omega=-i/\tau}\left[\frac{1}{2\pi}\,\frac{e^{-i\omega t}}{(-i\tau)(\omega + i/\tau)}\right] \\
&= \frac{n_f e^2}{m_e}\,e^{-t/\tau}.
\end{aligned} \tag{14.67}$$

Then, eq. (14.15) becomes

$$\mathbf{j}_{\rm free}(t) = \frac{n_f e^2}{m_e}\int_{-\infty}^{t} dt'\, e^{-(t-t')/\tau}\,\mathbf{E}(t'). \tag{14.68}$$


This shows that, in the Drude model, the value of the current at time $t$ is
determined by the value of the electric field at all times $t' < t$, weighted
with an exponential factor that, basically, tells us that the correlation
takes place only over a time interval $t - t'$ of order $\tau$. The value of
the electric field at very early times $t'$, such that $t - t' \gg \tau$, has an
effect that is exponentially suppressed. In other words, the system has a
"memory" of order $\tau$, i.e., of order of the typical collision time. This is
completely analogous to the result that we found in eq. (14.47) for the
electric susceptibility.
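To make the role of the memory kernel concrete, one can evaluate eq. (14.68) for a field that is switched on abruptly at $t = 0$ and held constant afterwards. In that case the integral is elementary and gives $j_{\rm free}(t) = \sigma_0 E_0\,(1 - e^{-t/\tau})$, so the current relaxes to its static value $\sigma_0 E_0$ on a time-scale $\tau$. The sketch below checks this with a simple midpoint sum; the function name `drude_current`, the step field and the units ($\tau = 1$, $n_f e^2/m_e = 1$) are illustrative assumptions.

```python
import math


def drude_current(field, t, tau, prefactor, dt=1e-3):
    """Evaluate eq. (14.68) by a midpoint sum over t' in [t - 40*tau, t].

    `field` is a callable E(t'); `prefactor` stands for n_f e^2 / m_e.
    Contributions from t' < t - 40*tau are suppressed by exp(-40) and dropped.
    """
    t_start = t - 40.0 * tau
    n = round((t - t_start) / dt)
    total = 0.0
    for k in range(n):
        tp = t_start + (k + 0.5) * dt
        total += math.exp(-(t - tp) / tau) * field(tp)
    return prefactor * total * dt


def step_field(tp):
    """Field switched on at t' = 0 with unit amplitude (illustrative choice)."""
    return 1.0 if tp >= 0.0 else 0.0


tau = 1.0
for t in (0.5, 1.0, 5.0):
    numeric = drude_current(step_field, t, tau, prefactor=1.0)
    closed_form = tau * (1.0 - math.exp(-t / tau))
    print(t, numeric, closed_form)
```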

Finally, we can study the high-frequency limit. From eqs. (14.61) and
(14.62), in the limit $\omega\tau \gg 1$,

$$\sigma_R(\omega) \to \frac{\sigma_0}{\omega^2\tau^2}, \tag{14.69}$$

$$\sigma_I(\omega) \to \frac{\sigma_0}{\omega\tau}. \tag{14.70}$$

Therefore, both the real and imaginary parts go to zero, as expected
from the general argument discussed at the end of Section 14.1, but $\sigma_R$
goes to zero faster than $\sigma_I$, so that in the high-frequency limit the
function $\sigma(\omega)$ becomes purely imaginary. Note also that, using eq. (14.60),
in eq. (14.70) $\tau$ cancels. This signals that, in the high-frequency limit,
the details of the relaxation mechanisms are no longer relevant, to
$O(1/\omega)$. This is due to the fact that, when driven at such a high
frequency, the motion of the electrons reverses many times before they
have a chance to collide, and this is the dominant mechanism that
suppresses the generation of a bulk movement, which would otherwise lead
to a macroscopic current.

The sum rule (14.25) is also easily checked,

$$\begin{aligned}
\sigma_I(\omega) &\simeq \frac{2}{\pi\omega}\int_0^\infty d\omega'\, \sigma_R(\omega') \qquad (\omega \to \infty), \\
&= \frac{2\sigma_0}{\pi\omega}\int_0^\infty d\omega'\, \frac{1}{1+\omega'^2\tau^2} \\
&= \frac{\sigma_0}{\omega\tau},
\end{aligned} \tag{14.71}$$

which correctly reproduces eq. (14.70).
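The approach to the asymptotic regime can be illustrated numerically. The sketch below compares the exact imaginary part of the Drude conductivity, eq. (14.62), with the sum-rule prediction of eq. (14.71), using the analytic value $\pi\sigma_0/(2\tau)$ for the integral of $\sigma_R$. The units $\sigma_0 = \tau = 1$ and the function name `sigma_drude` are illustrative assumptions.

```python
import math


def sigma_drude(omega, sigma_0, tau):
    """Drude conductivity of eq. (14.59)."""
    return sigma_0 / (1.0 - 1j * omega * tau)


sigma_0, tau = 1.0, 1.0   # illustrative units

# Integral of sigma_R over all frequencies, done analytically:
# int_0^inf d omega' sigma_0 / (1 + omega'^2 tau^2) = pi sigma_0 / (2 tau).
integral_sigma_R = math.pi * sigma_0 / (2.0 * tau)

for omega in (10.0, 100.0, 1000.0):
    exact = sigma_drude(omega, sigma_0, tau).imag
    sum_rule = 2.0 / (math.pi * omega) * integral_sigma_R
    print(omega, exact, sum_rule, exact / sum_rule)
```

The ratio in the last column is $\omega^2\tau^2/(1 + \omega^2\tau^2)$, which approaches one as $\omega\tau$ grows.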


14.5 The dielectric function of metals 


As already discussed in Section 13.6.2, in metals we have both free and
bound electrons, whose contributions combine into a single response
function $\epsilon(\omega)$, the dielectric function of metals, given by eq. (13.90). To
understand the physical meaning of this response function, we use the
Drude-Lorentz model (14.40) for the contribution of the bound
electrons, and the Drude model (14.58) for $\sigma(\omega)$. We now denote by
$\epsilon_b(\omega)$ the contribution from the bound electrons, which we write as

$$\frac{\epsilon_b(\omega)}{\epsilon_0} = 1 + \frac{n_b e^2}{\epsilon_0 m_e}\sum_{i=1}^{N}\frac{(f_b)_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i}, \tag{14.72}$$

where, for reasons that will become clear soon, we have now denoted by
$(f_b)_i$ the quantities that were denoted by $f_i$ in eq. (14.40), and we used
the explicit expression $n_b e^2/(\epsilon_0 m_e)$ for the quantity that was denoted as
$\omega_p^2$ in eq. (14.38). Note that

$$\sum_{i=1}^{N} (f_b)_i = 1. \tag{14.73}$$

This gives, for the full dielectric function $\epsilon_r(\omega) = \epsilon(\omega)/\epsilon_0$ of a metal,

$$\epsilon_r(\omega) = 1 - \frac{n_f e^2}{\epsilon_0 m_e}\,\frac{1}{\omega^2 + i\omega/\tau} + \frac{n_b e^2}{\epsilon_0 m_e}\sum_{i=1}^{N}\frac{(f_b)_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i}, \tag{14.74}$$

where $n_f$ and $n_b$ are the number densities of free and bound electrons,
respectively. We now introduce the total number density of electrons,

15 In eq. (14.80), for $\omega \neq 0$ the contribution to the integral proportional
to $(\sigma_0/\epsilon_0)$ vanishes, using the definition of principal value of the
integral,

$$\begin{aligned}
P\!\int_0^\infty \frac{d\omega'}{\omega'^2 - \omega^2}
&= \lim_{\epsilon\to 0^+}\left(\int_0^{\omega-\epsilon} + \int_{\omega+\epsilon}^{\infty}\right)\frac{d\omega'}{\omega'^2 - \omega^2} \\
&= \frac{1}{2\omega}\lim_{\epsilon\to 0^+}\left[\,\log\left|\frac{\omega'-\omega}{\omega'+\omega}\right|\,\Bigg|_{0}^{\omega-\epsilon} + \log\left|\frac{\omega'-\omega}{\omega'+\omega}\right|\,\Bigg|_{\omega+\epsilon}^{\infty}\,\right] \\
&= 0.
\end{aligned}$$

This term must, however, be kept to get the correct limit for $\omega \to 0$
since, according to eq. (14.81), $\omega\epsilon_{r,I}(\omega) - (\sigma_0/\epsilon_0)$ vanishes as $\omega^2$ as
$\omega \to 0$, ensuring that the integral in eq. (14.80) is well defined also for
$\omega \to 0$ [compare with the discussion below eq. (14.26) for $\sigma(\omega)$].


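$n = n_f + n_b$, while the plasma frequency, in the presence of both free
and bound electrons, is defined as

$$\omega_p^2 = \frac{n e^2}{\epsilon_0 m_e}. \tag{14.75}$$

We also define $f_0 = n_f/n$, $f_i = (n_b/n)(f_b)_i$ and $\gamma_0 = 1/\tau$. Then,
eq. (14.74) can be rewritten as

$$\begin{aligned}
\epsilon_r(\omega) &= 1 - \omega_p^2\,\frac{f_0}{\omega^2 + i\omega\gamma_0} + \omega_p^2\sum_{i=1}^{N}\frac{f_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i} \\
&= 1 + \omega_p^2\sum_{i=0}^{N}\frac{f_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i},
\end{aligned} \tag{14.76}$$

where, in the last line, we included in the sum the index $i = 0$, with
$\omega_{i=0} = 0$ and $\gamma_{i=0} = \gamma_0 = 1/\tau$. Note also that

$$\sum_{i=0}^{N} f_i = f_0 + \sum_{i=1}^{N} f_i = \frac{n_f}{n} + \frac{n_b}{n} = 1. \tag{14.77}$$

We see that the contribution of the free electrons to $\epsilon(\omega)$ is formally the
same as the contribution of bound electrons with characteristic frequency
$\omega_i = 0$, corresponding to the fact that, for free electrons, there is no
restoring force.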
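Observe that, because of the factor

$$\frac{1}{\omega^2 + i\omega\tau^{-1}} = \frac{1}{\omega}\,\frac{1}{\omega + i\tau^{-1}}, \tag{14.78}$$

for metals the function $\epsilon_r(\omega)$, beside a pole in the lower half-plane at
$\omega = -i\tau^{-1}$, also has a pole on the real axis at $\omega = 0$. As we see from
eq. (13.90), this holds in general, simply because, for a metal, the d.c.
conductivity $\sigma_0 \neq 0$, and is not specific to the model (14.76). It is
therefore an example of the situation, discussed in Note 6 on page 370,
in which the Kramers-Kronig relations hold for the function from which
the pole on the real axis has been subtracted. To subtract the pole, we
must consider the function

$$g(\omega) = \epsilon_r(\omega) - \frac{i\sigma_0}{\epsilon_0\omega}. \tag{14.79}$$

Writing $g(\omega) = g_R(\omega) + ig_I(\omega)$, we have $g_R(\omega) = \epsilon_{r,R}(\omega)$ and
$g_I(\omega) = \epsilon_{r,I}(\omega) - (\sigma_0/\epsilon_0\omega)$. Then, the derivation of the Kramers-Kronig
relations (14.31) and (14.32) goes through, with $\epsilon_r(\omega)$ replaced by
$g(\omega)$, and we obtain¹⁵

$$\epsilon_{r,R}(\omega) = 1 + \frac{2}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\omega'\epsilon_{r,I}(\omega') - (\sigma_0/\epsilon_0)}{\omega'^2 - \omega^2}, \tag{14.80}$$

$$\epsilon_{r,I}(\omega) = \frac{\sigma_0}{\epsilon_0\omega} - \frac{2\omega}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\epsilon_{r,R}(\omega') - 1}{\omega'^2 - \omega^2}. \tag{14.81}$$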


For metals, it is often a good first approximation to neglect the
contribution of the bound electrons, setting $n_b = 0$, i.e., $f_0 = 1$ and
$f_i = 0$ for $i \geq 1$. The plasma frequency is then given in terms of the
number density of free electrons,

$$\omega_p^2 = \frac{n_f e^2}{\epsilon_0 m_e}. \tag{14.82}$$

In this case, from eq. (14.60), we can also write the plasma frequency in
terms of the conductivity at zero frequency $\sigma_0 \equiv \sigma(\omega = 0)$ and of $\tau$, as

$$\omega_p^2 = \frac{\sigma_0}{\epsilon_0\tau}. \tag{14.83}$$

In this limit, we also find it convenient to introduce the notation¹⁶

$$\gamma_p \equiv \frac{1}{\tau}. \tag{14.84}$$

Then, setting $f_0 = 1$ and $f_i = 0$ for $i \geq 1$, eq. (14.76) can be rewritten
as

$$\epsilon_r(\omega) = 1 - \frac{\omega_p^2}{\omega^2 + i\omega\gamma_p}. \tag{14.85}$$

The relation between $\omega_p$ and $\gamma_p$ allows us to classify the materials as
good or bad conductors. As we will see in Section 15.3.2, for
$\omega_p\tau \gg 1$, i.e., $\omega_p \gg \gamma_p$, a spatial distribution of electrons perturbed
from its uniform equilibrium state can oscillate freely for many cycles,
with little damping. Therefore, these materials are good conductors. In
contrast, for $\omega_p \lesssim \gamma_p$ we have a poor conductor.¹⁷
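To get a feeling for the orders of magnitude, the sketch below evaluates $\omega_p$ from eq. (14.82) and $\tau$ from eq. (14.60) for a typical good conductor. The numerical inputs (a free-electron density of about $8.5\times 10^{28}\,{\rm m}^{-3}$ and a static conductivity of about $6\times 10^{7}\,{\rm S/m}$, roughly appropriate for copper at room temperature) are approximate values assumed here for illustration and are not taken from the text; the function names are likewise hypothetical.

```python
import math

# Physical constants (SI units).
E_CHARGE = 1.602176634e-19      # elementary charge, C
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
EPS_0 = 8.8541878128e-12        # vacuum permittivity, F/m


def plasma_frequency(n_f):
    """omega_p from eq. (14.82), for a free-electron density n_f in m^-3."""
    return math.sqrt(n_f * E_CHARGE**2 / (EPS_0 * M_ELECTRON))


def collision_time(sigma_0, n_f):
    """tau from eq. (14.60), sigma_0 = n_f e^2 tau / m_e, with sigma_0 in S/m."""
    return sigma_0 * M_ELECTRON / (n_f * E_CHARGE**2)


# Assumed, approximate room-temperature values for copper (not from the text).
n_f = 8.5e28        # m^-3
sigma_0 = 6.0e7     # S/m

omega_p = plasma_frequency(n_f)
tau = collision_time(sigma_0, n_f)
gamma_p = 1.0 / tau
print(f"omega_p = {omega_p:.3e} s^-1, gamma_p = {gamma_p:.3e} s^-1")
print(f"omega_p * tau = {omega_p * tau:.1f}")
```

With these inputs $\omega_p\tau$ comes out of order a few hundred, so the condition $\omega_p \gg \gamma_p$ for a good conductor is comfortably satisfied.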

The model (14.76), which assumed a non-interacting gas of classical
electrons, can be improved by including interactions and a
quantum-mechanical treatment. Several of these effects, such as the effect
of the band structure of the material (i.e., the interaction of the
electrons with the lattice of ions), as well as the interaction of the
electrons among themselves and with the coherent vibrations of the lattice
(described, at the quantum level, by phonons), can be modeled by
replacing the electron mass $m_e$ in eq. (14.82) by an effective mass $m_*$.
This leaves the functional form (14.76) unchanged (if we neglect a
frequency dependence that actually appears in $m_*$), although the
determination of $\omega_p$ from the properties of the material is now more
complicated, and $\omega_p$ does not depend only on the number density $n$ of
the electrons. Furthermore, the net contribution from the positive ion
cores is modeled, phenomenologically, with a constant $\epsilon_\infty$, so that
eq. (14.76) becomes

$$\epsilon_r(\omega) = \epsilon_\infty + \omega_p^2\sum_{i=0}^{N}\frac{f_i}{\omega_i^2 - \omega^2 - i\omega\gamma_i}. \tag{14.87}$$


16 The subscript "p" in $\gamma_p$ (which is more commonly denoted simply as
$\gamma$) stresses its analogy with $\omega_p$, since they are both properties of
the metal, or, as in Section 15.3.2, of a plasma. These quantities are
strictly related, and can appear as the real and imaginary parts of a
complex frequency, in the combination $\pm\omega_p - i(\gamma_p/2)$, see eq. (15.56).


17 Writing eq. (14.83) in the form $\omega_p(\omega_p\tau) = \sigma_0/\epsilon_0$ we see that the
condition $\omega_p\tau \gg 1$ is equivalent to

$$\frac{\sigma_0}{\epsilon_0} \gg \omega_p. \tag{14.86}$$

18 In Figs. 14.4 and 14.5 we used $\omega_p = 13.8\times 10^{15}\,{\rm s}^{-1}$,
$\gamma_0 = 0.0058\times 10^{15}\,{\rm s}^{-1}$, $f_0 = 0.760$ and, for $i = 1,\ldots,5$,

$f_i = \{0.024,\ 0.010,\ 0.071,\ 0.601,\ 4.384\}$,

$\omega_i = \{0.630,\ 1.261,\ 4.510,\ 6.538,\ 20.234\}\times 10^{15}\,{\rm s}^{-1}$,

$\gamma_i = \{0.366,\ 0.524,\ 1.322,\ 3.789,\ 3.363\}\times 10^{15}\,{\rm s}^{-1}$,

and $\epsilon_\infty = 1.2$, from Rakić et al. (1998).


Fig. 14.5 As in Fig. 14.4, on a much larger range of frequencies, and on
a log-log scale. Note that now, because of the log-log scale, we plot
$|\epsilon_{r,R}(\omega)|$; $\epsilon_{r,R}(\omega)$ is negative at small $\omega$, and changes sign in
correspondence of the downward spikes. The first spike is made of two
very close spikes not distinguishable in the figure, so $\epsilon_{r,R}(\omega)$ has five
zeros, corresponding to the five frequencies $\omega_i$ that we have included;
including higher modes would produce more structure and further
downward spikes at higher frequencies. At large $\omega$,
$\epsilon_{r,R}(\omega) \to \epsilon_\infty > 0$. In contrast, $\epsilon_{r,I}(\omega)$ is always positive.


Fig. 14.4 The prediction of the Drude-Lorentz model for the real and
imaginary parts of $\epsilon(\omega)$ for gold, using a logarithmic scale on the
horizontal axis and a linear scale on the vertical axis, and emphasizing a
region of frequencies of order $\omega_p$.


For the free electron gas $\epsilon_\infty = 1$ while, for typical metals,
$\epsilon_\infty \simeq 1$-$10$. This reflects the fact that the natural oscillation
frequencies of the ions are much larger than that of the bound electrons
and, at least at optical and UV frequencies, we are not yet in the limit of
$\omega$ much larger than all natural frequencies $\omega_i$ of the system; the
constant $\epsilon_\infty$ takes into account, phenomenologically, the effect of these
high-frequency modes. Separating the real and imaginary parts, we get

$$\epsilon_{r,R}(\omega) = \epsilon_\infty - \frac{\omega_p^2 f_0}{\omega^2 + \gamma_0^2} + \omega_p^2\sum_{i=1}^{N}\frac{f_i(\omega_i^2 - \omega^2)}{(\omega_i^2 - \omega^2)^2 + \omega^2\gamma_i^2}, \tag{14.88}$$

$$\epsilon_{r,I}(\omega) = \frac{\omega_p^2\gamma_0 f_0}{\omega(\omega^2 + \gamma_0^2)} + \omega_p^2\,\omega\sum_{i=1}^{N}\frac{f_i\gamma_i}{(\omega_i^2 - \omega^2)^2 + \omega^2\gamma_i^2}. \tag{14.89}$$

Observe that, because of the contribution of the free electrons, the
imaginary part $\epsilon_{r,I}(\omega)$ diverges as $1/\omega$ as $\omega \to 0$. The real part, in
contrast, saturates to a constant; in particular, if $\gamma_0 \ll \omega_p$, as is
typically the case, the contribution from the free electrons to the real
part saturates to the value $-f_0(\omega_p/\gamma_0)^2$, which is negative, and large in
absolute value. At high frequencies, $\epsilon_{r,I}(\omega)$ vanishes as $1/\omega^3$, and
$\epsilon_r(\omega)$ becomes real. In Fig. 14.4 we show the prediction of the
Drude-Lorentz model for $\epsilon_{r,R}(\omega)$ and $\epsilon_{r,I}(\omega)$, using the values of
$\omega_p$, $f_0$, $\gamma_0$, and the first five values of $f_i$, $\omega_i$ and $\gamma_i$ in the
case of a metal such as gold,¹⁸ on a log-linear scale and a relatively
small range of frequencies of the order of the plasma frequency, to
emphasize the structures in this range of frequencies. In Fig. 14.5 we
show the result on a much larger frequency range and a log-log scale that
emphasizes the asymptotic behaviors.
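The limiting behaviors just described can be reproduced with a few lines of code. The following sketch evaluates eq. (14.87) with the parameter values quoted in Note 18, working in units of $10^{15}\,{\rm s}^{-1}$; the function name `eps_r_metal` and the list-based layout of the parameters are choices made here for illustration, and the sample frequencies are arbitrary.

```python
def eps_r_metal(omega, omega_p, eps_inf, f, omega_res, gamma):
    """Dielectric function of eq. (14.87).

    The lists f, omega_res, gamma are indexed by i = 0, ..., N, with
    omega_res[0] = 0 for the free-electron (Drude) term.
    """
    total = eps_inf
    for f_i, w_i, g_i in zip(f, omega_res, gamma):
        total += omega_p**2 * f_i / (w_i**2 - omega**2 - 1j * omega * g_i)
    return total


# Parameters quoted in Note 18 (gold), in units of 10^15 s^-1.
omega_p = 13.8
eps_inf = 1.2
f = [0.760, 0.024, 0.010, 0.071, 0.601, 4.384]
omega_res = [0.0, 0.630, 1.261, 4.510, 6.538, 20.234]
gamma = [0.0058, 0.366, 0.524, 1.322, 3.789, 3.363]

for omega in (1.0e-5, 0.1, 1.0, 10.0, 1.0e4):
    eps = eps_r_metal(omega, omega_p, eps_inf, f, omega_res, gamma)
    print(f"omega = {omega:g}: eps_R = {eps.real:.4g}, eps_I = {eps.imag:.4g}")
```

At the smallest frequency the product $\omega\,\epsilon_{r,I}(\omega)$ approaches $f_0\omega_p^2/\gamma_0$, reflecting the $1/\omega$ divergence of the imaginary part, while at the largest frequency the real part approaches $\epsilon_\infty$.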


Electromagnetic waves in 
material media 


In Chapter 9 we studied electromagnetic waves propagating in vac- 
uum. In this chapter we study how electromagnetic waves are affected 
by the properties of the medium in which they propagate, described
by frequency-dependent constitutive relations such as those studied in 
Chapter 14, or by non-trivial boundary conditions, as in the case of 
electromagnetic waves propagating in waveguides. 


15.1 Electromagnetic waves in dielectrics 


We first consider the propagation of electromagnetic waves in dielectric 
materials. Since we want to consider generic dispersive media, we look 
for monochromatic wave solutions of the form 


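$$
\mathbf{E}(t,\mathbf{x}) = \tilde{\mathbf{E}}(\omega,\mathbf{k})\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \qquad
\mathbf{B}(t,\mathbf{x}) = \tilde{\mathbf{B}}(\omega,\mathbf{k})\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}} , \tag{15.1}
$$
and we write
$$
\tilde{\mathbf{D}}(\omega,\mathbf{k}) = \epsilon(\omega)\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) , \qquad
\tilde{\mathbf{B}}(\omega,\mathbf{k}) = \mu(\omega)\,\tilde{\mathbf{H}}(\omega,\mathbf{k}) , \tag{15.2}
$$


where $\epsilon$ and $\mu$ are assumed to depend on $\omega$ but not on $\mathbf{k}$.¹ We admit
a priori that both $\omega$ and $\mathbf{k}$ in the ansatz (15.1) could be complex. A
frequency $\omega = \omega_R + i\omega_I$, with imaginary part $\omega_I < 0$, corresponds to an
excitation that has been set up at an initial time, and then decreases
exponentially in time, while $\omega_I > 0$ corresponds to a solution that increases
exponentially in time, which can be obtained if the medium pumps
energy into the electromagnetic wave, leading to an amplification.²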

A complex $\mathbf{k}$, in contrast, corresponds to a solution that decreases
or increases exponentially in space, from a given surface that will be
typically a boundary of the material, where boundary conditions are
set. The appropriate solution will then be selected by imposing suitable
boundary conditions. For instance, in a semi-infinite medium which
extends from $x = 0$ to $x = \infty$, for a wave propagating in the positive $x$
direction we would require that the solution goes to zero as $x \to +\infty$,
which selects the decreasing solution. Note that, at this stage, $\omega$ and $\mathbf{k}$
are independent variables, that enter the ansatz (15.1). The dispersion
relation of the electromagnetic waves in the material, i.e., the relation
between $\omega$ and $\mathbf{k}$, will be obtained by requiring that the ansatz indeed
satisfies Maxwell's equations.

¹ A further generalization of the constitutive relations that we have
studied in Chapter 14 can be obtained admitting also a $\mathbf{k}$ dependence in
the functions describing the response of the material, e.g., writing
$\tilde{\mathbf{D}}(\omega,\mathbf{k}) = \epsilon(\omega,\mathbf{k})\,\tilde{\mathbf{E}}(\omega,\mathbf{k})$.


² As we discussed in Note 9 on page 371, in the context of the quantum
Drude-Lorentz model, the oscillator strengths $f_i$ that appear in
eq. (14.40) can be positive or negative, and negative values correspond to
"emission oscillator strengths", where an excited electron makes a
transition to a lower energy state. This leads to pumping energy into the
electromagnetic wave; a particularly important example is the mechanism
that gives rise to lasers.

³ Note that in this section $k$ will denote the modulus of the
three-dimensional vector $\mathbf{k}$, so $k^2 = |\mathbf{k}|^2$ [or, more precisely, for
complex $\mathbf{k}$, $k^2 = (\mathbf{k}_R + i\mathbf{k}_I)\cdot(\mathbf{k}_R + i\mathbf{k}_I)$]; in contrast, in a
relativistic context, such as in Chapter 9, we used the notation $k^2$ for
$k_\mu k^\mu$.


⁴ We assume that, for each $\mathbf{k}$, the solution for $\omega$ exists (otherwise
there is no plane wave solution) and is unique. If there are several
solutions, one can apply the analysis that follows to each of them.


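For dielectrics, we also set to zero the density of free charges and
currents. The computation is then quite similar to that of the propagation
of electromagnetic waves in vacuum studied in Chapter 9. Inserting the
ansatz (15.1) into Maxwell's equations in material media,
eqs. (13.45)-(13.48), we get

$$
\epsilon(\omega)\,\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0 , \tag{15.3}
$$
$$
\mathbf{k}\times\tilde{\mathbf{B}}(\omega,\mathbf{k}) = -\frac{\omega}{c^2}\, n^2(\omega)\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) , \tag{15.4}
$$
$$
\mathbf{k}\cdot\tilde{\mathbf{B}}(\omega,\mathbf{k}) = 0 , \tag{15.5}
$$
$$
\mathbf{k}\times\tilde{\mathbf{E}}(\omega,\mathbf{k}) = \omega\,\tilde{\mathbf{B}}(\omega,\mathbf{k}) , \tag{15.6}
$$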


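where we have defined the refraction index $n(\omega)$ from

$$
n(\omega) = c\,\sqrt{\epsilon(\omega)\mu(\omega)} . \tag{15.7}
$$

Note that $n(\omega)$ is dimensionless; in vacuum, $\epsilon(\omega)\mu(\omega)$ becomes $\epsilon_0\mu_0$,
which is equal to $1/c^2$ [recall eq. (8.27)], and therefore $n(\omega) = 1$. Using
$\epsilon_0\mu_0 = 1/c^2$ we can also rewrite eq. (15.7) as

$$
n^2(\omega) = \frac{\epsilon(\omega)\mu(\omega)}{\epsilon_0\mu_0} = \epsilon_r(\omega)\,\mu_r(\omega) , \tag{15.8}
$$

where, as usual, $\epsilon_r(\omega) = \epsilon(\omega)/\epsilon_0$ and $\mu_r(\omega) = \mu(\omega)/\mu_0$.

For the study of electromagnetic waves in dielectrics, we can assume
that $\epsilon(\omega)$ never vanishes, as we will discuss in more detail below
eq. (15.58). Therefore, eq. (15.3) is equivalent to $\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0$. Taking
the vector product of eq. (15.6) with $\mathbf{k}$ [which, in coordinate space,
corresponds to taking its curl, as we did in the derivation of eqs. (9.73) and
(9.74)] and using eq. (1.9), together with $\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0$ and eq. (15.4),
we get

$$
-k^2\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) = \omega\,\mathbf{k}\times\tilde{\mathbf{B}}(\omega,\mathbf{k})
= -\frac{\omega^2}{c^2}\, n^2(\omega)\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) , \tag{15.9}
$$

where $k^2 = \mathbf{k}\cdot\mathbf{k}$.³ Similarly, from the curl of eq. (15.4), we get

$$
k^2\,\tilde{\mathbf{B}}(\omega,\mathbf{k}) = \frac{\omega^2}{c^2}\, n^2(\omega)\,\tilde{\mathbf{B}}(\omega,\mathbf{k}) . \tag{15.10}
$$

We therefore find the dispersion relation

$$
k^2 = \frac{\omega^2}{c^2}\, n^2(\omega) . \tag{15.11}
$$

In vacuum, where $n(\omega) = 1$, we recover the dispersion relation (9.41).
Since $\omega$ is now implicitly fixed in terms of $\mathbf{k}$ by the condition (15.11), to
label the solutions we can use just $\mathbf{k}$, rather than the pair $(\omega, \mathbf{k})$.⁴ We
then introduce the simpler notations

$$
\tilde{\mathbf{E}}_{\mathbf{k}} \equiv \tilde{\mathbf{E}}(\omega(\mathbf{k}),\mathbf{k}) , \qquad
\tilde{\mathbf{B}}_{\mathbf{k}} \equiv \tilde{\mathbf{B}}(\omega(\mathbf{k}),\mathbf{k}) . \tag{15.12}
$$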


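Recalling that we eventually take the real part, we can summarize the
ansatz as

$$
\mathbf{E}(t,\mathbf{x}) = {\rm Re}\left[ \tilde{\mathbf{E}}_{\mathbf{k}}\, e^{-i\omega(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{x}} \right] , \tag{15.13}
$$
$$
\mathbf{B}(t,\mathbf{x}) = {\rm Re}\left[ \tilde{\mathbf{B}}_{\mathbf{k}}\, e^{-i\omega(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{x}} \right] , \tag{15.14}
$$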


where $\omega(\mathbf{k})$ is given by eq. (15.11). Equation (15.11) is a necessary
condition for having a solution, but it is not yet sufficient. To find the
full set of conditions on the ansatz, we plug eqs. (15.13) and (15.14) into
Maxwell's equations (15.3)-(15.6). From eq. (15.3) (having assumed
that $\epsilon(\omega)$ never vanishes) and eq. (15.5) we get

$$
\mathbf{k}\cdot\tilde{\mathbf{E}}_{\mathbf{k}} = 0 , \qquad \mathbf{k}\cdot\tilde{\mathbf{B}}_{\mathbf{k}} = 0 . \tag{15.15}
$$

This shows that, just as for the electromagnetic waves in vacuum, for
electromagnetic waves in dielectrics $\mathbf{E}$ and $\mathbf{B}$ are orthogonal to the
propagation direction. Finally, using eq. (15.11), eq. (15.6) can be rewritten
as

$$
c\,\tilde{\mathbf{B}}_{\mathbf{k}} = n(\omega)\, \hat{\mathbf{k}}\times\tilde{\mathbf{E}}_{\mathbf{k}} , \tag{15.16}
$$

and the same condition comes from eq. (15.4). This tells us that, just
as in vacuum, $\mathbf{E}$ and $c\mathbf{B}$ are orthogonal to each other, but now their
modulus is different, compare with eq. (9.51). We have therefore shown
that the ansatz (15.13)-(15.14) is indeed a solution of the full set of
Maxwell's equations, under the conditions (15.11), (15.15), and (15.16),
and therefore dielectric materials sustain the propagation of
monochromatic electromagnetic waves.
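
As a quick consistency check of the conditions (15.11), (15.15), and (15.16), the sketch below builds $\mathbf{k}$ and the magnetic amplitude from a chosen frequency, refraction index, and transverse electric amplitude, and then evaluates how well the four algebraic Maxwell equations (15.3)-(15.6) are satisfied. The helper names and the numerical inputs are illustrative assumptions; the refraction index is allowed to be complex.

```python
C = 299_792_458.0  # speed of light in m/s


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    # Plain bilinear product (no complex conjugation), as in k.k for complex k.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def plane_wave(omega, n, k_hat, e_amp):
    """Return (k, B) for an amplitude e_amp orthogonal to the unit vector k_hat."""
    k = tuple(omega * n / C * x for x in k_hat)          # k^2 = (omega/c)^2 n^2
    b = tuple(n / C * x for x in cross(k_hat, e_amp))    # c B = n k_hat x E
    return k, b


def maxwell_residuals(omega, n, k, e_amp, b):
    """Largest violation of the four equations, relative to |k| |E|."""
    scale = abs(omega * n / C) * max(abs(x) for x in e_amp)
    r_gauss_e = abs(dot(k, e_amp))
    r_gauss_b = C * abs(dot(k, b))
    r_faraday = max(abs(x - omega * y) for x, y in zip(cross(k, e_amp), b))
    r_ampere = C * max(abs(x + omega * n**2 / C**2 * y)
                       for x, y in zip(cross(k, b), e_amp))
    return max(r_gauss_e, r_gauss_b, r_faraday, r_ampere) / scale


if __name__ == "__main__":
    omega, n = 3.0e15, 1.5 + 0.01j      # illustrative values
    e_amp = (1.0, 0.5j, 0.0)            # elliptically polarized, orthogonal to z
    k, b = plane_wave(omega, n, (0.0, 0.0, 1.0), e_amp)
    print("relative residual:", maxwell_residuals(omega, n, k, e_amp, b))
```

The residual is at the level of floating-point rounding, which reflects the fact that the three conditions are sufficient as well as necessary.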

Observe that, in a dispersive medium, $\epsilon(\omega)$ and $\mu(\omega)$ are necessarily
complex. Indeed, from the Kramers-Kronig dispersion relation (14.31)
we see that, if $\epsilon_{r,I}(\omega) = 0$ for all $\omega$, then $\epsilon_{r,R} = 1$ independently of
the frequency. Therefore, a non-trivial frequency dependence of $\epsilon_{r,R}(\omega)$
implies that, at least for some frequencies, $\epsilon_{r,I}(\omega) \neq 0$, and similarly
for $\mu(\omega)$. Therefore $n(\omega)$ in eq. (15.7) is complex, and the dispersion
relation (15.11) is also complex. As a consequence, eq. (15.11) can in
general have solutions with both $\omega$ and $\mathbf{k}$ complex, $\omega = \omega_R + i\omega_I$ and
$\mathbf{k} = \mathbf{k}_R + i\mathbf{k}_I$. The simplest solutions correspond to $\omega$ real. In this case
the temporal evolution is just an undamped oscillation. With the time
dependence $e^{-i\omega t}$ of eq. (15.1), solutions with $\omega_I < 0$ disappear
exponentially in time and are therefore less relevant to the free
propagation, while solutions with $\omega_I > 0$ are exponentially growing and
represent instabilities, where energy is pumped into the
electromagnetic wave. In any case, since $n(\omega)$ is complex, even for $\omega$
real the corresponding value of $\mathbf{k}$, obtained from eq. (15.11), is in general
complex. We write⁵

$$
n(\omega) = n_R(\omega) + i\,n_I(\omega) , \tag{15.17}
$$


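⁵ In the literature, another common notation is
$\tilde{n}(\omega) = n(\omega) + i\kappa(\omega)$. The real part, $n_R(\omega)$ or $n(\omega)$, is called
the real refractive index, while the imaginary part, $n_I(\omega)$ or $\kappa(\omega)$, is
also called the extinction coefficient.

and $\mathbf{k} = \mathbf{k}_R + i\mathbf{k}_I$. Recalling that $k^2 = \mathbf{k}\cdot\mathbf{k} = (\mathbf{k}_R + i\mathbf{k}_I)\cdot(\mathbf{k}_R + i\mathbf{k}_I)$,
eq. (15.11) separates into two equations for the real and the imaginary
parts,

$$
|\mathbf{k}_R|^2 - |\mathbf{k}_I|^2 = \frac{\omega^2}{c^2}\left[ n_R^2(\omega) - n_I^2(\omega) \right] , \tag{15.18}
$$
$$
\mathbf{k}_R\cdot\mathbf{k}_I = \frac{\omega^2}{c^2}\, n_R(\omega)\, n_I(\omega) . \tag{15.19}
$$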


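The solution (15.13), for $\omega$ real, can then be rewritten as

$$
\mathbf{E}(t,\mathbf{x}) = {\rm Re}\left[ \tilde{\mathbf{E}}_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}_R\cdot\mathbf{x}} \right] e^{-\mathbf{k}_I\cdot\mathbf{x}} , \tag{15.20}
$$

and the same for $\mathbf{B}(t,\mathbf{x})$. The term $e^{-i\omega t + i\mathbf{k}_R\cdot\mathbf{x}}$ gives the phase of
the field, while the term $e^{-\mathbf{k}_I\cdot\mathbf{x}}$ affects its amplitude and describes the
attenuation of the wave as it propagates in the material. Note that the
surfaces of constant phase are perpendicular to $\mathbf{k}_R$ while the surfaces of
constant amplitude are perpendicular to $\mathbf{k}_I$. Thus, unless $\mathbf{k}_R$ and $\mathbf{k}_I$
are parallel, these two surfaces are different. Solutions where $\mathbf{k}_R$ and $\mathbf{k}_I$
are not parallel are called inhomogeneous plane waves.

In the simpler case of homogeneous plane waves, where $\mathbf{k}_R$ and $\mathbf{k}_I$ are
parallel, we can write $\mathbf{k}_R = k_R\hat{\mathbf{k}}$, $\mathbf{k}_I = k_I\hat{\mathbf{k}}$, and eqs. (15.18) and (15.19)
give

$$
k_R = \frac{\omega}{c}\, n_R(\omega) , \tag{15.21}
$$
$$
k_I = \frac{\omega}{c}\, n_I(\omega) . \tag{15.22}
$$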
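
For a homogeneous plane wave at real $\omega$, the real and imaginary parts of the wavenumber follow directly from the complex square root of $\epsilon_r\mu_r$. The following is a minimal sketch, with a hypothetical helper name and an illustrative value of $\epsilon_r$ for a weakly absorbing dielectric (not data for a specific material).

```python
import cmath

C = 299_792_458.0  # speed of light in m/s


def complex_wavenumber(omega, eps_r, mu_r=1.0):
    """Return (k_R, k_I) in 1/m for a homogeneous plane wave at real omega."""
    n = cmath.sqrt(eps_r * mu_r)   # principal root, so n_R >= 0
    return omega / C * n.real, omega / C * n.imag


if __name__ == "__main__":
    omega = 3.0e15                 # rad/s, illustrative
    eps_r = 2.25 + 0.09j           # illustrative, weakly absorbing
    k_r, k_i = complex_wavenumber(omega, eps_r)
    print(f"k_R = {k_r:.4e} 1/m, k_I = {k_i:.4e} 1/m")
    print(f"amplitude attenuation length 1/k_I = {1.0 / k_i:.4e} m")
```

The principal branch of the square root gives $n_R \geq 0$; for an absorbing medium, where the imaginary part of $\epsilon_r\mu_r$ is positive, it also gives $n_I > 0$, i.e., the solution that decays along the propagation direction.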


15.2 Phase velocity and group velocity 


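We next discuss two distinct notions of velocity related to the
propagation of electromagnetic waves. We neglect absorption, so we set $\mathbf{k}_I = 0$,
$\mathbf{k}_R = \mathbf{k}$ and $k = |\mathbf{k}|$. Setting $\mathbf{k}_I = 0$ in eq. (15.20), we have

$$
\mathbf{E}(t,\mathbf{x}) = {\rm Re}\left[ \tilde{\mathbf{E}}_{\mathbf{k}}\, e^{-i\varphi(t,\mathbf{x})} \right] , \tag{15.23}
$$
where the phase $\varphi$ is given by
$$
\varphi(t,\mathbf{x}) = \omega(k)\,t - \mathbf{k}\cdot\mathbf{x} , \tag{15.24}
$$

and the dependence of $\omega$ on $k$ is given by the inversion of eq. (15.11).
The surfaces of constant phase therefore travel at the velocity $\mathbf{v}_p = v_p\hat{\mathbf{k}}$,
where

$$
v_p(k) = \frac{\omega(k)}{k} . \tag{15.25}
$$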


Using eq. (15.21) (and setting $n_R = n$ since we are neglecting all
imaginary parts), the phase velocity can be written as

$$
v_p(\omega) = \frac{c}{n(\omega)} . \tag{15.26}
$$

Usually, $n(\omega) > 1$, so $v_p(\omega) < c$. In some situations, however, $n(\omega)$ can
be smaller than one. This is the case, for instance, for a dielectric
described by the Drude-Lorentz model, where $\epsilon_r(\omega)$ is given by eq. (14.39).
For a dielectric we can set $\mu(\omega) = \mu_0$ so, from eq. (15.8), $n^2(\omega) = \epsilon_r(\omega)$.
Therefore, in the approximation in which all imaginary parts are
neglected we have $n^2(\omega) = \epsilon_{r,R}(\omega)$, and we see from eq. (14.42) that this
becomes smaller than one for $\omega > \omega_0$.⁶ This, however, is not in
conflict with the postulates of Special Relativity, because monochromatic
waves simply do not carry information. In order to transmit
information, we must modulate the signal by superposing plane waves into wave
packets.⁷ The relevant question, therefore, is whether wave packets can
transmit information at speed greater than $c$. According to eq. (15.26),
Fourier modes with different values of $\omega$ travel at a different phase
velocity. To understand the consequences of this, we consider a superposition
of plane waves with different wavenumbers $\mathbf{k}$,

$$
\mathbf{E}(t,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \tilde{\mathbf{E}}(\mathbf{k})\, e^{-i\omega(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{x}} , \tag{15.27}
$$


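where, to be more general, we have assumed that $\omega$ depends not only on
the modulus $k = |\mathbf{k}|$, but on the full vector $\mathbf{k}$, which could happen, in
general, in anisotropic materials, and we take $\mathbf{k}$ and $\omega(\mathbf{k})$ real. Actually,
nothing in our considerations will depend on the vector nature of the
electric field, and we can more simply study a relation of the form

$$
f(t,\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \tilde{f}(\mathbf{k})\, e^{-i\omega(\mathbf{k})t + i\mathbf{k}\cdot\mathbf{x}} , \tag{15.28}
$$

for some function $f$, which could represent a component of the electric
field, or an actual scalar function, in which case our analysis would
apply also to other kinds of waves, such as sound waves. If the function
$\tilde{f}(\mathbf{k})$ is completely generic, each Fourier mode will travel at a different
velocity, the spatial shape of the signal will be quickly distorted by
the propagation, and there is little that can be said in full generality.
If, however, $\tilde{f}(\mathbf{k})$ is sharply peaked around a wavenumber $\mathbf{k}_0$, we can
expand the frequency as

$$
\omega(\mathbf{k}) \simeq \omega(\mathbf{k}_0) + (\mathbf{k}-\mathbf{k}_0)_i \left( \frac{\partial\omega(\mathbf{k})}{\partial k_i} \right)_{\mathbf{k}=\mathbf{k}_0} + \ldots
= \omega_0 + (\mathbf{k}-\mathbf{k}_0)\cdot(\nabla_{\mathbf{k}}\omega)_{\mathbf{k}=\mathbf{k}_0} + \ldots , \tag{15.29}
$$

where $\omega_0 = \omega(\mathbf{k}_0)$. Then

$$
f(t,\mathbf{x}) \simeq e^{-i\omega_0 t + i\mathbf{k}_0\cdot\mathbf{x}} \int \frac{d^3k}{(2\pi)^3}\, \tilde{f}(\mathbf{k})\, e^{i(\mathbf{k}-\mathbf{k}_0)\cdot(\mathbf{x}-\mathbf{v}_g t)} , \tag{15.30}
$$

where

$$
\mathbf{v}_g = (\nabla_{\mathbf{k}}\omega)_{\mathbf{k}=\mathbf{k}_0} \tag{15.31}
$$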

is called the group velocity. For an isotropic medium, where $\omega$ depends
only on $k = |\mathbf{k}|$, using eq. (1.23) with $r$ replaced by $k$ we see that the
group velocity is in the direction $\hat{\mathbf{k}}$ of the propagation, and its modulus
is given by

$$
v_g = \left. \frac{d\omega(k)}{dk} \right|_{k=k_0} . \tag{15.32}
$$

⁶ Actually, we see from Fig. 14.2 that for $\omega$ just above the resonance
frequency $\omega_0$, $\epsilon_R(\omega)/\epsilon_0$ even becomes negative. Note, however, that this
happens precisely when $\epsilon_I(\omega)/\epsilon_0$ is large and we cannot neglect the
imaginary parts.


⁷ An alternative argument is that a purely monochromatic plane wave is
just a mathematical idealization. Indeed, from a basic property of the
Fourier transform, if we denote by $\Delta t$ the duration of a signal and by
$\Delta\omega$ the spread in frequencies of its Fourier transform, we have
$\Delta t\,\Delta\omega \gtrsim 1$. Any physical signal that we observe has a finite temporal
duration and therefore its Fourier transform is necessarily non-vanishing
over an interval of frequencies $\Delta\omega \gtrsim 1/\Delta t$.

Equation (15.30) shows that, apart from an overall phase $e^{-i\omega_0 t + i\mathbf{k}_0\cdot\mathbf{x}}$
(that disappears in quadratic quantities such as $|f|^2$, so in our case in the
energy density of the electromagnetic field, which is given by $|\mathbf{E}|^2$), in the
approximation in which we stop the expansion in eq. (15.29) to linear
order, the shape of the wavepacket remains the same and is just translated
in space at a velocity $\mathbf{v}_g$. The group velocity is therefore the quantity
relevant to the transport of energy by a packet of electromagnetic waves.
Note, however, that this only holds in the approximation where $\tilde{f}(\mathbf{k})$ is
sharply peaked around a wavenumber $\mathbf{k}_0$, which justifies the expansion
(15.29). Using $k = (\omega/c)\,n(\omega)$ and writing $d\omega/dk = (dk/d\omega)^{-1}$, we get

$$
v_g(\omega) = \frac{c}{n(\omega) + \omega\,\dfrac{dn}{d\omega}} . \tag{15.33}
$$

For normal dispersion we have $n(\omega) > 1$ and $dn/d\omega > 0$. Then
$v_g(\omega) < v_p(\omega) < c$. It is possible, however, to have $dn/d\omega$ negative, and large
in absolute value. An example is given again by the Drude-Lorentz
model (14.39), or by its generalization (14.40): as before, we write
$n^2(\omega) = \epsilon_{r,R}(\omega)$. Then, from Fig. 14.2 on page 372 we see that, just
above the resonance frequency $\omega_0$, the derivative of $\epsilon_{r,R}(\omega)$ becomes
negative, and large in absolute value. This behavior is called anomalous
dispersion and, formally, can give rise to a group velocity larger than
$c$, or even negative. However, again, this simply means that, in such
regions, the approximation (15.29) becomes invalid (furthermore, as
already mentioned in Note 6, in this regime absorption is large and we
cannot neglect it), and the concept of group velocity loses its meaning.
The actual evolution of a wavepacket in this regime is more involved,
and cannot be captured, even approximately, by a single quantity such
as a velocity.
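
The ordering $v_g < v_p < c$ in a region of normal dispersion can be illustrated numerically. The sketch below assumes a lossless single-resonance model, $n^2(\omega) = 1 + \omega_p^2/(\omega_0^2-\omega^2)$, evaluated well below the resonance; the model, the parameter values, and the function names are illustrative assumptions. The derivative $dn/d\omega$ is estimated with a central finite difference.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def n_lorentz(omega, omega_p, omega_0):
    """Real refractive index of a lossless single-resonance medium (illustrative)."""
    return math.sqrt(1.0 + omega_p**2 / (omega_0**2 - omega**2))


def phase_velocity(omega, n_of_omega):
    """v_p = c / n(omega)."""
    return C / n_of_omega(omega)


def group_velocity(omega, n_of_omega, rel_step=1e-4):
    """v_g = c / (n + omega dn/domega), with dn/domega from a central difference."""
    h = rel_step * omega
    dn = (n_of_omega(omega + h) - n_of_omega(omega - h)) / (2.0 * h)
    return C / (n_of_omega(omega) + omega * dn)


if __name__ == "__main__":
    def n_fn(w):
        return n_lorentz(w, omega_p=1.0e16, omega_0=2.0e16)

    omega = 5.0e15  # well below the resonance, where absorption is negligible
    print(f"v_p / c = {phase_velocity(omega, n_fn) / C:.5f}")
    print(f"v_g / c = {group_velocity(omega, n_fn) / C:.5f}")
```

Close to the resonance the same formula can return values larger than $c$ or negative, but there the neglected absorption is large and the result has no physical meaning, in line with the discussion above.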


15.3 Electromagnetic waves in metals 


We now study electromagnetic waves in conducting materials. We start
again from Maxwell's equations in material media, eqs. (13.45)-(13.48).
The difference with the treatment for dielectrics of Section 15.1 is that
now we must include the density of free charges and currents, $\rho_{\rm free}$ and
$\mathbf{j}_{\rm free}$. We already worked out the corresponding expression for Maxwell's
equations that depends on the sources in eqs. (13.88) and (13.89), where
$\epsilon(\omega)$ is the dielectric function for metals, given by eq. (13.90). We look
for a monochromatic plane wave solution of the form (15.1). The full
set of Maxwell's equations (13.88), (13.89), (13.47), and (13.48) then
becomes

$$
\epsilon(\omega)\,\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0 , \tag{15.34}
$$
$$
\mathbf{k}\times\tilde{\mathbf{B}}(\omega,\mathbf{k}) + \omega\,\epsilon(\omega)\mu(\omega)\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0 , \tag{15.35}
$$
$$
\mathbf{k}\cdot\tilde{\mathbf{B}}(\omega,\mathbf{k}) = 0 , \tag{15.36}
$$
$$
\mathbf{k}\times\tilde{\mathbf{E}}(\omega,\mathbf{k}) - \omega\,\tilde{\mathbf{B}}(\omega,\mathbf{k}) = 0 . \tag{15.37}
$$


Consider first eq. (15.34). There are two branches of solutions. One
is obtained setting $\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0$. This solution is transverse to the
propagation direction, just as we found for electromagnetic waves in
vacuum, as well as for electromagnetic waves in dielectrics. We will study
this solution in Section 15.3.1. There is, however, another possibility:
eq. (15.34) is automatically satisfied, without imposing the
transversality condition on $\mathbf{E}$, at a special value $\bar\omega$ of the frequency, given by
the solution of the equation $\epsilon(\bar\omega) = 0$. We will study this solution in
Section 15.3.2.⁸

⁸ We will discuss after eq. (15.58) why, in dielectrics, the corresponding
solution obtained setting $\epsilon(\omega) = 0$ does not correspond to a propagating
electromagnetic wave.


15.3.1 Transverse EM waves


In this section we study the solution of eq. (15.34) obtained requiring
that $\mathbf{k}\cdot\tilde{\mathbf{E}}(\omega,\mathbf{k}) = 0$. In this case, the electric field is transverse to
the propagation direction. Equation (15.37) tells us that $\tilde{\mathbf{B}}(\omega,\mathbf{k})$ is
orthogonal to $\tilde{\mathbf{E}}(\omega,\mathbf{k})$, and eq. (15.36) tells us that it is also orthogonal
to the propagation direction. We therefore have the same situation as
the electromagnetic waves in vacuum, or in dielectrics, with the
electric and magnetic fields orthogonal to the propagation direction and to
each other. The dispersion relation $\omega(k)$ can be obtained combining
Maxwell's equations, just as we did for dielectrics. Taking the vector
product of eq. (15.37) with $\mathbf{k}$ and using eqs. (1.9), (15.34), and (15.35),
we get

$$
-k^2\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) = \omega\,\mathbf{k}\times\tilde{\mathbf{B}}(\omega,\mathbf{k})
= -\omega^2\,\epsilon(\omega)\mu(\omega)\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) . \tag{15.38}
$$

Then, the dispersion relation is

$$
\omega^2\,\epsilon(\omega)\mu(\omega) = k^2 , \tag{15.39}
$$


which, of course, is the same as eq. (15.11) (recalling the definition
(15.7) of the refraction index), except that now $\epsilon(\omega)$ is the dielectric
function appropriate for a metal, given in eq. (13.90). Let us consider
the consequences of this dispersion relation, studying first the low- and
high-frequency limits. We limit ourselves to $\omega$ real. However, since the
function $\epsilon(\omega)$ is complex, the corresponding solution for $k$ will in general
be complex, $k = k_R + i k_I$.

In the low-frequency limit $\omega\tau \ll 1$, i.e., $\omega \ll 1/\tau = \gamma_p$, eq. (13.90)
gives $\epsilon(\omega) \simeq i\sigma_0/\omega$, where, as usual, $\sigma_0 = \sigma(\omega = 0)$ is the
(zero-frequency) conductivity, while $\mu(\omega) \simeq \mu(\omega = 0) \equiv \mu$ becomes the same
as the static permeability of the material. Then eq. (15.39) gives (taking
the square root with positive imaginary part)

$$
k \simeq \frac{1+i}{\sqrt{2}}\, \sqrt{\sigma_0\mu\omega} , \tag{15.40}
$$
and therefore in this limit the imaginary part of $k$ is
$$
k_I = \left( \frac{\sigma_0\mu\omega}{2} \right)^{1/2} . \tag{15.41}
$$

⁹ Of course, in this setting, in which we are considering a wave
propagating into a metal along the positive $x$ direction, we have chosen the
solution for $k$ with positive imaginary part, so as to have an exponentially
decreasing amplitude. The solution of eq. (15.39) with
$k_I = -(\sigma_0\mu\omega/2)^{1/2}$ is eliminated by the boundary condition that the
corresponding solution for the electromagnetic wave does not diverge as
$x \to \infty$. If we rather consider a setting with the metal at $x < 0$ and a
left-moving wave coming from positive $x$, we would choose the other
solution for $k_I$, $k_I = -(\sigma_0\mu\omega/2)^{1/2}$, to ensure that the electromagnetic
wave does not diverge as $x \to -\infty$.


¹⁰ For orientation, in terms of wavelengths, the far UV ranges from 10 to
200 nm, middle UV is 200-300 nm and near UV 300-380 nm. With longer
wavelengths we enter the visible range, from violet (380-450 nm) to red
(625-750 nm, i.e., 0.625 to 0.75 μm). Then come the near infrared (NIR,
about 0.75 to 2.5 μm), middle infrared (MIR, 2.5-10 μm), and far infrared
(FIR, 10 μm-1 mm). From 1 mm to about 1 m we are in the domain of
microwaves.


¹¹ The unit of conductivity in the SI system was discussed in Note 49 on
page 96, see in particular eq. (4.188).


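Setting the propagation direction along the positive $x$ axis, and the
vacuum-metal interface at $x = 0$, at $x > 0$ the amplitude of the wave
therefore decreases as $e^{-k_I x} = e^{-x/\delta_{\rm skin}}$, where $\delta_{\rm skin} = 1/k_I$ is called the
skin depth.⁹ From eq. (15.41),

$$
\delta_{\rm skin}(\omega) = \left( \frac{2}{\sigma_0\mu\omega} \right)^{1/2} , \qquad (\omega \ll \gamma_p) . \tag{15.42}
$$

Furthermore, for non-magnetic materials, we can approximate $\mu \simeq \mu_0$.
For instance, for copper, $\mu/\mu_0 \simeq 0.999994$. Then, writing
$\mu \simeq \mu_0 = 1/(\epsilon_0 c^2)$, we get

$$
\delta_{\rm skin}(\omega) \simeq c \left( \frac{2\epsilon_0}{\sigma_0\omega} \right)^{1/2} , \qquad (\omega \ll \gamma_p) . \tag{15.43}
$$
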
To get an idea of typical numbers, in copper at 20°C the collision time
is $\tau \simeq 2.4 \times 10^{-14}$ s, so $\gamma_p \simeq 4.2 \times 10^{13}\,{\rm s}^{-1}$. The condition
$\omega \ll \gamma_p$ is therefore satisfied for $f \ll \gamma_p/(2\pi) \simeq 6$ THz. In terms of
$\lambda = c/f$, this corresponds to $\lambda \gg 50$ μm, i.e., wavelengths longer than the
mid infrared.¹⁰ In this regime, we can apply eq. (15.43). The static
conductivity of Cu at 20°C is $\sigma_0 \simeq 5.96 \times 10^{7}$ S/m.¹¹ From eqs. (2.12),
(2.13), and (4.188) we see that $\sigma_0/\epsilon_0$ has dimensions of ${\rm s}^{-1}$, i.e., has the
same dimensions as a frequency, and, for Cu at 20°C, we get
$\sigma_0/\epsilon_0 \simeq 6.73 \times 10^{18}\,{\rm s}^{-1}$. Setting for instance as a reference value
$f = 10$ GHz, in the microwave region, we get

$$
\delta_{\rm skin}(\omega) \simeq 0.65\ \mu{\rm m}\, \left( \frac{10\ {\rm GHz}}{\omega/(2\pi)} \right)^{1/2} , \qquad
\left( {\rm Cu},\ \frac{\omega}{2\pi} \ll 6\ {\rm THz} \right) . \tag{15.44}
$$

Also observe that, from eq. (14.83), in copper at room temperature, the
plasma frequency is $\omega_p \simeq 1.7 \times 10^{16}\,{\rm s}^{-1}$, or
$f_p = \omega_p/(2\pi) \simeq 2.7 \times 10^{15}$ Hz.
The corresponding wavelength is $\lambda_p = 2\pi c/\omega_p \simeq 110$ nm, in the far UV.

We next consider the high frequency limit. There are two frequency
scales in the problem, $\gamma_p = 1/\tau$ and $\omega_p$. As mentioned below eq. (14.85),
good conductors are characterized by the property $\omega_p \gg \gamma_p$ (as we have
seen above, in copper $\omega_p \simeq 1.7 \times 10^{16}\,{\rm s}^{-1}$ and $\gamma_p \simeq 4.2 \times 10^{13}\,{\rm s}^{-1}$).
Therefore, there are two different regimes at $\omega\tau \gg 1$, namely
$\gamma_p \ll \omega \ll \omega_p$ and $\omega \gg \omega_p$. We will study the full behavior below. For the moment,
we focus on the very high frequency regime $\omega \gg \omega_p$. Then, plugging
eq. (14.85) into eq. (15.39) (with $\mu(\omega) \simeq \mu_0$), and keeping the leading
correction in $\omega_p^2/\omega^2$ both in the real and in the imaginary part of $\omega^2\epsilon(\omega)$,
we get

$$
k^2c^2 \simeq \omega^2 \left( 1 - \frac{\omega_p^2}{\omega^2} \right) + i\,\gamma_p\, \frac{\omega_p^2}{\omega} . \tag{15.45}
$$
Writing $k = k_R + i k_I$, eq. (15.45) gives
$$
\omega^2 \simeq \omega_p^2 + k_R^2 c^2 , \tag{15.46}
$$
and
$$
k_I c \simeq \frac{\gamma_p}{2} \left( \frac{\omega_p}{\omega} \right)^2 . \tag{15.47}
$$

Equation (15.46) is the dispersion relation of transverse electromagnetic
waves in a metal in the limit $\omega \gg \omega_p$, i.e., for $k_R c \gg \omega_p$.¹²
Equation (15.47) shows that, in the limit $\omega \gg \omega_p$, $k_I c$ is much smaller than
the frequency scale $\gamma_p$ given by the inverse of the collision time, and goes
to zero as $\omega/\omega_p \to \infty$. Therefore, metals become almost transparent to
electromagnetic waves for $\omega$ much larger than their plasma frequency.
For metals, typical values of $\omega_p$ are in the UV. For instance, for Cu we
have seen that $\omega_p$ is in the far UV. For alkali metals the plasma frequency
is even smaller; for instance, for thin films of cesium the transparency
already starts in the violet part of the visible spectrum.
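
The copper numbers quoted in this section can be reproduced with a few lines. The sketch below uses the low-frequency skin depth $\delta_{\rm skin} \simeq c\,(2\epsilon_0/\sigma_0\omega)^{1/2}$ of eq. (15.43) and the high-frequency estimate $k_I c \simeq (\gamma_p/2)(\omega_p/\omega)^2$ of eq. (15.47). The constant and function names are illustrative assumptions; the material inputs are the rounded values quoted in the text.

```python
import math

C = 299_792_458.0          # speed of light, m/s
EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
SIGMA0_CU = 5.96e7         # static conductivity of Cu at 20 C quoted in the text, S/m
TAU_CU = 2.4e-14           # collision time quoted in the text, s
OMEGA_P_CU = 1.7e16        # plasma frequency quoted in the text, 1/s


def skin_depth_low_freq(f_hz, sigma0=SIGMA0_CU):
    """delta_skin = c sqrt(2 eps0 / (sigma0 omega)), valid for omega << gamma_p."""
    omega = 2.0 * math.pi * f_hz
    return C * math.sqrt(2.0 * EPS0 / (sigma0 * omega))


def k_imag_high_freq(omega, gamma_p, omega_p):
    """k_I = (gamma_p / 2c) (omega_p / omega)^2, valid for omega >> omega_p."""
    return gamma_p / (2.0 * C) * (omega_p / omega) ** 2


if __name__ == "__main__":
    gamma_p = 1.0 / TAU_CU
    print(f"gamma_p             = {gamma_p:.3e} 1/s")
    print(f"sigma0 / eps0       = {SIGMA0_CU / EPS0:.3e} 1/s")
    print(f"delta_skin (10 GHz) = {skin_depth_low_freq(10e9) * 1e6:.3f} micrometers")
    print(f"lambda_p            = {2.0 * math.pi * C / OMEGA_P_CU * 1e9:.1f} nm")
    omega = 100.0 * OMEGA_P_CU
    print(f"k_I c / gamma_p at omega = 100 omega_p: "
          f"{k_imag_high_freq(omega, gamma_p, OMEGA_P_CU) * C / gamma_p:.1e}")
```

The last line makes the transparency statement quantitative: at $\omega = 100\,\omega_p$ the ratio $k_I c/\gamma_p$ is $(\omega_p/\omega)^2/2$, a very small number.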

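Actually, using the simple model (14.85) for the dielectric function
of a metal (and setting $\mu(\omega) = \mu_0$), it is not difficult, and instructive,
to study the dispersion relation (15.39) for $\omega$ generic, rather than only
in the limiting cases $\omega \ll \gamma_p$ and $\omega \gg \omega_p$. Inserting eq. (14.85) into
eq. (15.39) we get

$$
k^2c^2 = \omega^2 - \frac{\omega^2\,\omega_p^2}{\omega^2 + i\omega\gamma_p} . \tag{15.48}
$$

Writing $k = k_R + i k_I$ and separating eq. (15.39) into its real and
imaginary parts, the resulting equations can be combined into a second degree
equation for the variable $k_I^2$, whose solution is

$$
k_I^2(\omega) = \frac{\omega^2}{2c^2} \left\{ \left[ \left( 1 - \frac{\omega_p^2}{\omega^2+\gamma_p^2} \right)^2
+ \left( \frac{\gamma_p\,\omega_p^2}{\omega\,(\omega^2+\gamma_p^2)} \right)^2 \right]^{1/2}
- \left( 1 - \frac{\omega_p^2}{\omega^2+\gamma_p^2} \right) \right\} . \tag{15.49}
$$
The corresponding solution for $k_R(\omega)$ is given by¹³
$$
k_R(\omega) = \frac{\omega\,\omega_p^2\,\gamma_p}{2c^2\,(\omega^2+\gamma_p^2)\,k_I(\omega)} . \tag{15.50}
$$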
¹² The reader familiar with quantum mechanics can observe that, in
quantum mechanics, the energy $E$ of a particle is related to its frequency
by $E = \hbar\omega$, and the momentum $p$ is related to the (real part of the)
wavenumber $k$ by $p = \hbar k$. Then, the above dispersion relation takes the
form $E^2 = m_p^2c^4 + p^2c^2$, with $m_pc^2 = \hbar\omega_p$. This is the dispersion
relation of a massive particle with mass $m_p$, see eq. (7.147). In this
sense, in a conductor, the photon becomes massive.


¹³ In the literature, eq. (15.46) is sometimes used to argue that there is
no solution for $k_R$ for $\omega < \omega_p$. However, this is incorrect because
eq. (15.46) only holds for $\omega \gg \omega_p$ and, as we see from the exact solution
(15.49, 15.50), $k_R(\omega)$ is non-vanishing for all values of $\omega$. Note also
that the right-hand side of eq. (15.49) is always positive, so $k_I(\omega)$ is
well defined for all values of $\omega$.

¹⁴ For instance, for copper at room temperature, the electron mean free
path is $\ell \simeq 4.2 \times 10^{-8}$ m, while at $\lambda = 10$ μm (corresponding to
$\omega \simeq 1.88 \times 10^{14}\,{\rm s}^{-1}$ and $\omega/\gamma_p \simeq 4.5$), eq. (15.49) gives
$\delta_{\rm skin} \simeq 1.8 \times 10^{-8}$ m, so for these frequencies the assumption
$\ell \ll \delta_{\rm skin}(\omega)$ breaks down.

Fig. 15.1 The skin depth $\delta_{\rm skin}(\omega)$ for the simple model (14.85), using the
values of $\sigma_0$ and $\tau$ for copper at room temperature. The corresponding values
of $\gamma_p$ and $\omega_p$ are indicated by the horizontal dotted lines. The bands of the
electromagnetic spectrum corresponding to the values of $\omega$ are also indicated
(the label "v" stands for "visible"). The smaller ticks in the IR and UV regions
correspond to the subdivision of the IR into far, middle, and near IR (from
left to right), and similarly for the subdivision of the UV into near, middle
and far UV (again from left to right).


In Fig. 15.1 we plot (on a log-log scale) the corresponding skin depth
$\delta_{\rm skin}(\omega) = 1/k_I(\omega)$, using the values of $\omega_p$ and $\gamma_p$ appropriate for
copper at room temperature. We see that there are indeed three distinct
regimes: at $\omega \ll \gamma_p$, $\delta_{\rm skin}(\omega)$ decreases as $1/\sqrt{\omega}$, in agreement with
eq. (15.43). From eq. (15.49), in this regime

$$
\delta_{\rm skin}(\omega) \simeq c\, \frac{(2\gamma_p)^{1/2}}{\omega_p\,\omega^{1/2}} , \qquad (\omega \ll \gamma_p) , \tag{15.51}
$$

which, upon use of eq. (14.83), is the same as eq. (15.43). The solution
for $\delta_{\rm skin}(\omega)$ then flattens in the region $\gamma_p \lesssim \omega \lesssim \omega_p$ and, as we approach
the plasma frequency, it rises sharply. When $\omega \gg \omega_p$, $\delta_{\rm skin}(\omega)$
eventually grows as $\omega^2$, in agreement with eq. (15.47) so, for instance, metals
can be quite transparent to X-ray radiation. Note, however, that, apart
from the fact that we have used the very simplified model (14.85) for the
response function, at a more fundamental level the classical analysis used
to produce Fig. 15.1 breaks down for X-ray radiation. In this regime a
full quantum treatment, based on (coherent and incoherent) Compton
scattering, becomes necessary, see also the discussion in Section 16.2.
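
The three regimes of the skin depth can be reproduced without separating real and imaginary parts by hand, by taking the complex square root of the Drude dielectric function directly. The sketch below assumes $\epsilon_r(\omega) = 1 - \omega_p^2/(\omega^2 + i\omega\gamma_p)$ with $\mu = \mu_0$, and uses the rounded copper values quoted in the text; the function and constant names are illustrative assumptions.

```python
import cmath
import math

C = 299_792_458.0            # speed of light, m/s
OMEGA_P_CU = 1.7e16          # plasma frequency of Cu quoted in the text, 1/s
GAMMA_P_CU = 1.0 / 2.4e-14   # 1/tau for Cu, 1/s


def skin_depth_drude(omega, omega_p=OMEGA_P_CU, gamma_p=GAMMA_P_CU):
    """Skin depth 1/k_I from k = (omega/c) sqrt(eps_r), Drude eps_r, mu = mu_0."""
    eps_r = 1.0 - omega_p**2 / (omega**2 + 1j * omega * gamma_p)
    k = omega / C * cmath.sqrt(eps_r)   # principal root has k_I > 0 (decaying wave)
    return 1.0 / k.imag


if __name__ == "__main__":
    cases = {
        "f = 10 GHz (omega << gamma_p)": 2.0 * math.pi * 10e9,
        "lambda = 10 micrometers": 2.0 * math.pi * C / 10e-6,
        "omega = omega_p": OMEGA_P_CU,
        "omega = 100 omega_p": 100.0 * OMEGA_P_CU,
    }
    for label, omega in cases.items():
        print(f"{label:32s} delta_skin = {skin_depth_drude(omega):.3e} m")
```

Because $\omega_p$ and $\tau$ are rounded, the microwave value differs by about one percent from the one obtained with the measured $\sigma_0$; the value at $\lambda = 10$ μm is close to the $1.8 \times 10^{-8}$ m quoted in Note 14.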
It should be observed, at this point, that the above computation
assumes the validity of Ohm's law in the form (13.81), and of the Drude
model of conductivity. As can be seen from the derivation in
Section 14.4, this in turn assumes that, at the frequency $\omega$ of interest, the
electron mean free path $\ell$ inside the conductor is small compared to the
length-scale over which the corresponding Fourier mode of the electric
field varies, which is given precisely by the skin depth $\delta_{\rm skin}(\omega)$. This
assumption indeed entered when we assumed that, in eq. (14.57), we could
neglect the $\mathbf{x}$ dependence of the electric field mode $\tilde{\mathbf{E}}(\omega,\mathbf{x})$. From Fig. 15.1
we see that $\delta_{\rm skin}(\omega)$ becomes quite low in the regime $\gamma_p \ll \omega < \omega_p$ and
in this domain this assumption can fail even at room temperatures (and
even more at low temperatures, where the mean free path can increase
significantly).¹⁴ In this case, we are in the regime of the anomalous skin
effect. In this regime the relation between the current and the electric
field is more complicated, and is rather given by an integral relation of
the form

$$
\tilde{\mathbf{j}}_{\rm free}(\omega,\mathbf{x}) = \int d^3x'\, \sigma(\omega,\mathbf{x}-\mathbf{x}')\,\tilde{\mathbf{E}}(\omega,\mathbf{x}') . \tag{15.52}
$$
Upon performing the Fourier transform also with respect to $\mathbf{x}$, this gives
$$
\tilde{\mathbf{j}}_{\rm free}(\omega,\mathbf{k}) = \sigma(\omega,\mathbf{k})\,\tilde{\mathbf{E}}(\omega,\mathbf{k}) . \tag{15.53}
$$

The determination of the kernel $\sigma(\omega,\mathbf{x}-\mathbf{x}')$ is then more complicated,
and requires solving a Boltzmann kinetic equation for the non-equilibrium
part of the electron distribution function in phase space.

Several other approximations must be improved before comparing 
Fig. 15.1 to the behavior of an actual metal. In particular, by using 
eq. (14.85), we have neglected the contribution from bound electrons 
and, more fundamentally, all our treatment, based on the Drude model, 
has been purely classical, and has neglected the interaction between the 
electrons. A full quantum treatment, including the effect of the band 
structure of the metals, belongs to a solid-state course. However, the 
simple model that we have discussed in this section already gives a first 
useful orientation. 


15.3.2 Longitudinal EM waves and plasma 
oscillations 


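We next investigate the other branch of solutions of eq. (15.34), which
exists if there is a value $\bar\omega$ of $\omega$ such that $\epsilon(\bar\omega) = 0$. First of all, we observe
that such a solution is physically possible. Using for instance the simple
model (14.85) for the response function, the equation $\epsilon(\bar\omega) = 0$ becomes

$$
\bar\omega^2 + i\gamma_p\bar\omega - \omega_p^2 = 0 , \tag{15.54}
$$
whose solutions are
$$
\bar\omega = \pm\sqrt{\omega_p^2 - \left( \frac{\gamma_p}{2} \right)^2} - i\,\frac{\gamma_p}{2} . \tag{15.55}
$$

For metals $\omega_p \gg \gamma_p$, so eq. (15.55) simplifies to
$$
\bar\omega \simeq \pm\,\omega_p - \frac{i}{2}\,\gamma_p . \tag{15.56}
$$
Therefore, at this special frequency, besides the "usual" transverse
electromagnetic waves, there is also a solution of eq. (15.34) where $\mathbf{E}$ is
longitudinal,

$$
\tilde{\mathbf{E}}(\omega = \bar\omega,\mathbf{k}) = \tilde{E}_{\mathbf{k}}\,\hat{\mathbf{k}} . \tag{15.57}
$$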
If we insert this expression into the other Maxwell's equations we see
that, since $\mathbf{k}\times\hat{\mathbf{k}} = 0$, eq. (15.37) requires that, on this solution, $\mathbf{B} = 0$.
Then, eq. (15.36) is trivially satisfied and, taking into account that
$\epsilon(\bar\omega) = 0$, also eq. (15.35) is satisfied. We have therefore found a
longitudinal electromagnetic wave, in which the electric field has the form

$$
\mathbf{E}(t,\mathbf{x}) = e^{-\gamma_p t/2}\, {\rm Re}\left[ \tilde{E}_{\mathbf{k}}\, e^{-i\omega_p t + i\mathbf{k}\cdot\mathbf{x}} \right] \hat{\mathbf{k}} , \tag{15.58}
$$

¹⁵ Actually, for $\omega_p < \gamma_p/2$, there is not even a real part in eq. (15.55),
and the solution for $\bar\omega$ becomes purely imaginary.

¹⁶ A non-trivial dispersion relation would emerge from a hydrodynamical
treatment of the oscillating electron fluid, see Section 10.8 of Jackson
(1998), and leads to a dispersion relation
$\omega^2(k) = \omega_p^2 + 3\langle v^2\rangle k^2$, where $\langle v^2\rangle$ is the average square velocity of
the electrons, related to the temperature by $m_e\langle v^2\rangle = 3kT$.


¹⁷ Recall that, in this computation, we have also set $\mu(\omega) = \mu_0$.


i.e., performs oscillations at the plasma frequency, with an amplitude
that is exponentially damped in time (since $\gamma_p = 1/\tau > 0$), and is
oriented in the propagation direction $\hat{\mathbf{k}}$, while $\mathbf{B} = 0$. Metals are
characterized by the condition $\omega_p \gg \gamma_p$. Therefore, the wave (15.58) performs
a large number of oscillations before being significantly damped.
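
To put a number on "a large number of oscillations", the sketch below solves the quadratic equation $\bar\omega^2 + i\gamma_p\bar\omega - \omega_p^2 = 0$ and counts the cycles completed during one amplitude damping time $2/\gamma_p$. The function names are illustrative assumptions, and the copper inputs are the rounded values quoted earlier in this section.

```python
import cmath
import math


def longitudinal_frequencies(omega_p, gamma_p):
    """Both roots of w^2 + i gamma_p w - omega_p^2 = 0."""
    root = cmath.sqrt(omega_p**2 - (gamma_p / 2.0) ** 2)
    return root - 0.5j * gamma_p, -root - 0.5j * gamma_p


def oscillations_per_damping_time(omega_p, gamma_p):
    """Number of cycles completed while the amplitude falls by a factor e."""
    w_plus, _ = longitudinal_frequencies(omega_p, gamma_p)
    damping_time = 1.0 / abs(w_plus.imag)       # amplitude ~ exp(-gamma_p t / 2)
    return abs(w_plus.real) * damping_time / (2.0 * math.pi)


if __name__ == "__main__":
    omega_p, gamma_p = 1.7e16, 1.0 / 2.4e-14    # rounded values for Cu from the text
    print("roots:", longitudinal_frequencies(omega_p, gamma_p))
    print("cycles per damping time:", oscillations_per_damping_time(omega_p, gamma_p))
    # Overdamped case omega_p < gamma_p / 2: the roots are purely imaginary.
    print("overdamped roots:", longitudinal_frequencies(1.0, 4.0))
```

For copper the count is of order one hundred, while for $\omega_p < \gamma_p/2$ the real part vanishes and there is no oscillation at all.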

One might ask why we did not consider this solution for dielectrics.
Again, one would have a longitudinal solution if there were a frequency
$\bar\omega$ such that $\epsilon(\bar\omega) = 0$ in eq. (15.3). If we use the model (14.39) for
the dielectric constant, the equation $\epsilon(\bar\omega) = 0$ becomes the same as
eq. (15.54), with $\omega_p^2$ replaced by $\omega_0^2 + \omega_p^2$ and $\gamma_p$ replaced by $\gamma_0$. However,
while a good conductor is characterized by the condition $\omega_p \gg \gamma_p$, for
a dielectric or a poor conductor we are rather in the opposite limit, in
which $\omega_p$ is at most of the same order as $\gamma_p$ (and the typical frequency
$\omega_0$ for an electron bound in an atom is also in general in the UV, of the
same order as $\omega_p$). Any wave oscillating as in eq. (15.58) would therefore
get damped on a timescale of at most a few cycles of oscillations, and
therefore does not describe an actual wave propagating in the medium.¹⁵

Note that the Fourier modes $\tilde{\mathbf{E}}(\omega = \bar\omega,\mathbf{k})$, in eq. (15.58), all oscillate
at the same plasma frequency $\omega_p$ (and have the same decay time),
independently of $\mathbf{k}$, so the dispersion relation of these modes is $\omega(k) = \omega_p$,
independent of $k$. This is due to our simplified modeling of the response
function $\sigma(\omega)$, which we have taken independent of $\mathbf{k}$.¹⁶

We now discuss the physics behind the longitudinal solution. From
eq. (14.82) we see that, if $n_f = 0$, then $\omega_p = 0$ and there are no
oscillations. These oscillations must therefore be related to oscillations of the
free charges in the medium, and disappear if there are no free charges.
Note that this is different from what happens to the transverse
electromagnetic waves that, according to eq. (15.39), in the limit $n_f = 0$,
where $\epsilon(\omega) = \epsilon_0$, have the standard dispersion relation of
electromagnetic waves in vacuum, $\omega = |\mathbf{k}|c$, see eq. (9.41).¹⁷

Consider a metal, in which the positively charged ions are, to a first 
approximation, fixed, while the electrons are free to move. In an equilibrium
situation, the macroscopic charge density $\rho_{\rm ions}(\mathbf{x})$ of the ions
and the macroscopic charge density $\rho_f(\mathbf{x})$ of the free electrons are
time-independent, and are equal and opposite at each (coarse-grained) point
in space, $\rho_{\rm ions}(\mathbf{x}) = -\rho_f(\mathbf{x})$, so the medium is overall electrically neutral.
Suppose now that, either because of statistical fluctuations or because 
of an external disturbance, some electrons are removed from a region 
and accumulate in another. Then, the electron density is lowered
in the first region and correspondingly enhanced in the second region. 
The first region will therefore be overall positively charged, because the 
positive ion charge density will no longer be fully compensated by the 
electron charge density, while the second will have an overall negative 
charge. As a result, there will be a net electric field, pointing from the 
positively charged region toward the negatively charged region. Under 
the action of this electric field, all the free electrons of the material will 
move, and in particular the “cloud” of excess electrons will move back 
toward the region where there is a net positive charge. If collisions with
the ions are negligible, it will arrive there with a significant velocity and
will swing on the opposite side. The force from the positively charged 
region will then attract it back, so this cloud of electrons performs a 
series of oscillations around the fixed position of the positively charged 
region. Eventually, these oscillations will be damped by the collisions. 
The electric field generated by these charge inhomogeneities points from 
the positively charged region toward the electron cloud and is therefore 
aligned with the direction of motion of the electron cloud (and opposite 
to it). It is therefore a longitudinal electric field, which points in the same
direction as that along which there is a spatial inhomogeneity, i.e.,
along the direction of the wavenumber $\mathbf{k}$ of the Fourier mode, and oscillates
together with the oscillations of the free electrons, passing through
zero at the moment when the electron cloud passes over the position of 
the positively charged region, since there the total charge densities of 
electrons and ions momentarily compensate each other. We therefore 
have a natural mechanism for creating an oscillating longitudinal elec- 
tric field. The same can happen in a plasma, where the positive and 
negative charges can both move freely. In this case, a cloud of excess 
positive charges and a cloud of excess negative charges would oscillate 
around their common center of mass. These oscillations are then called 
plasma oscillations (even in the case of metals). 

We now compute the frequency of these oscillations, using the Drude 
model, and we will show that we get precisely the plasma frequency 
computed above. We consider for definiteness the situation appropriate 
to a metal, where the ions are fixed and the free charges are provided 
only by the free electrons, but the argument can be generalized to the
case of a plasma, where both the ions and electrons are free to move. 
We start from eq. (14.56), which we recall here,

$$ \left(\frac{\partial}{\partial t} + \frac{1}{\tau}\right)\mathbf{j}_{\rm free}(t,\mathbf{x}) = \frac{n_f e^2}{m_e}\,\mathbf{E}(t,\mathbf{x}), \tag{15.59} $$

[where, compared to eq. (14.56), we have written $\partial\mathbf{j}_{\rm free}/\partial t$ instead of
$d\mathbf{j}_{\rm free}/dt$ since now we are considering an inhomogeneous situation where
$\mathbf{j}_{\rm free} = \mathbf{j}_{\rm free}(t,\mathbf{x})$] and we observe that, in this case, $\mathbf{E}$ is not a fixed
external electric field, but rather is generated by the positive and negative
charges in the metal. It therefore satisfies


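$$ \nabla\cdot\mathbf{E}(t,\mathbf{x}) = \frac{\rho(t,\mathbf{x})}{\epsilon_0}, \tag{15.60} $$

where $\rho(t,\mathbf{x}) = \rho_{\rm ions}(\mathbf{x}) + \rho_{\rm free}(t,\mathbf{x})$; note that we have taken a static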
distribution of ions, as appropriate to a metal, while the electrons pro- 
vide the freely moving charges and, in this out-of-equilibrium situation, 
have a time-dependent density. In a homogeneous equilibrium situa- 
tion, the positive and negative charge densities are independent of time, 
and compensate each other, so the total density vanishes. However, in 
the presence of spatial fluctuations, the two densities no longer com- 
pensate each other at each point, even if the total charges, i.e., their 
spatial integrals, are equal and opposite. The charge density and the
current density of the free electrons are related by the continuity equation
$\nabla\cdot\mathbf{j}_{\rm free} = -\partial\rho_{\rm free}/\partial t$. However, since $\rho_{\rm ions}$ is independent of time,
we can equally write it as $\nabla\cdot\mathbf{j}_{\rm free} = -\partial\rho/\partial t$. Taking the divergence of
eq. (15.59) and using eq. (15.60), we therefore get

$$ \left(\frac{\partial}{\partial t} + \frac{1}{\tau}\right)\frac{\partial\rho}{\partial t} = -\frac{n_f e^2}{\epsilon_0 m_e}\,\rho. \tag{15.61} $$

On the right-hand side we recognize the square of the plasma frequency,
eq. (14.82), so we can rewrite eq. (15.61) as

$$ \left(\frac{\partial^2}{\partial t^2} + \frac{1}{\tau}\frac{\partial}{\partial t} + \omega_p^2\right)\rho = 0. \tag{15.62} $$

Notice that, in our approximations, this equation contains no spatial derivatives,
i.e., it is the same for all spatial Fourier modes. Looking for a solution
$\rho(t) \propto e^{-i\omega t}$ we get the condition on $\omega$,

$$ \omega^2 + i\frac{\omega}{\tau} - \omega_p^2 = 0, \tag{15.63} $$

which is the same as eq. (15.54) (identifying as usual $\gamma_p = 1/\tau$), and
therefore has the same solutions, given in eq. (15.55). We have therefore
understood that the longitudinal solution of Maxwell's equations in a
conductor, found above, is due to the damped oscillations of the charge
inhomogeneities.

$^{18}$ In radio-frequency engineering, microwaves are rather defined, more
restrictively, as electromagnetic waves with wavelength between 3 mm and
0.3 m, corresponding to the range 1-100 GHz.

It might seem puzzling that, in plasma oscillations, no magnetic
field is generated, since the oscillating electron cloud generates a current.
However, from the Ampère-Maxwell law in materials, eq. (13.46), we
know that there are two contributions to the magnetic field, one coming
from the time derivative of $\mathbf{D}$ and the other from the current of the
free charges. In Fourier space, these two contributions are given by the
terms proportional to $\mathbf{D}(\omega,\mathbf{x})$ and to $\mathbf{j}_{\rm free}(\omega,\mathbf{x})$ in eq. (13.85), and they
combine to give the term proportional to $\epsilon(\omega)\mathbf{E}(\omega,\mathbf{x})$ in eq. (13.89). We
see that the condition $\epsilon(\omega) = 0$ imposes that these two contributions
cancel each other, and therefore there is no magnetic field.
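As a rough numerical illustration of the damped plasma oscillations described by eqs. (15.62) and (15.63), the following sketch solves the quadratic condition on $\omega$ and estimates how many oscillations take place within one decay time. The values of $\omega_p$ and $\tau$ are assumptions chosen only to represent the regime $\omega_p \gg \gamma_p$, not data for a specific metal, and the function name `plasma_roots` is hypothetical.

```python
import cmath
import math

def plasma_roots(omega_p, tau):
    """Roots of omega^2 + i*omega/tau - omega_p^2 = 0, cf. eq. (15.63)."""
    gamma_p = 1.0 / tau
    disc = cmath.sqrt(omega_p**2 - gamma_p**2 / 4.0)
    return (-0.5j * gamma_p + disc, -0.5j * gamma_p - disc)

# Illustrative (assumed) values in the regime omega_p >> gamma_p.
omega_p = 1.0e16   # plasma frequency in rad/s (assumption)
tau = 1.0e-14      # collision time in s (assumption)

w_plus, w_minus = plasma_roots(omega_p, tau)
# rho ~ exp(-i*omega*t), so Im(omega) < 0 gives an amplitude decaying as exp(-t/(2*tau)).
decay_time = -1.0 / w_plus.imag
cycles = w_plus.real * decay_time / (2.0 * math.pi)  # oscillations per decay time
print(w_plus, decay_time, cycles)
```

With these assumed numbers the real part of the root is very close to $\omega_p$ and the charge density performs many cycles before decaying, which is the situation described for good conductors.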


15.4 Electromagnetic waves in waveguides 


The range of wavelengths of microwaves is about $\lambda \sim 1$ mm to 1 m,
corresponding to frequencies $f = c/\lambda$ from 300 MHz (for $\lambda = 1$ m) to
300 GHz (for $\lambda = 1$ mm), and $\omega = 2\pi f$ from about $2 \times 10^{9}\,{\rm s}^{-1}$ to about
$2 \times 10^{12}\,{\rm s}^{-1}$.$^{18}$ Microwaves cannot be transported over large distances
with ordinary AC circuits since, when the dimensions of the circuit
become larger than the wavelength, the radiative losses become very large.
An alternative is provided by waveguides, i.e., hollow metallic pipes.
Electromagnetic waves can propagate inside such waveguides, and the
difference with respect to vacuum propagation comes from the boundary
conditions that must be imposed on the fields on the surface of the pipe.


15.4.1 Maxwell's equations in a waveguide


Consider first Maxwell’s equations inside the waveguide. Since there are 
no sources, inside we just have the usual Maxwell’s equations in vacuum 
(we assume that a good vacuum is made inside the waveguide; otherwise 
one should simply include the corresponding dielectric constant), that 
we write again here 


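$$ \nabla\cdot\mathbf{E} = 0, \tag{15.64} $$
$$ \nabla\times\mathbf{B} - \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} = 0, \tag{15.65} $$
$$ \nabla\cdot\mathbf{B} = 0, \tag{15.66} $$
$$ \nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0. \tag{15.67} $$


Observe that, for fields that are time-dependent, such as electromagnetic
waves, eq. (15.64) is implied by eq. (15.65). In fact, taking the divergence
of eq. (15.65) and using $\nabla\cdot(\nabla\times\mathbf{B}) = 0$, we obtain $\partial_t(\nabla\cdot\mathbf{E}) = 0$, which,
for a field with a time-dependence $e^{-i\omega t}$ with $\omega \neq 0$, implies $\nabla\cdot\mathbf{E} = 0$.
Similarly, eq. (15.66) is implied by eq. (15.67). Therefore, for
time-dependent fields in vacuum, it is sufficient to consider just eqs. (15.65)
and (15.67).$^{19}$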

We set the longitudinal direction of the waveguide along the $z$ axis.
The boundary conditions break the translation invariance in the $(x, y)$
plane and give a non-trivial structure to the solution in the $x$ and $y$
directions, so we look for a solution in the form of a wave propagating
along the $z$ direction, with a generic dependence on the $(x, y)$ variables,

$$ \mathbf{E}(t,\mathbf{x}) = \boldsymbol{\mathcal{E}}(x,y)\, e^{-i\omega t + ikz}, \tag{15.72} $$
$$ \mathbf{B}(t,\mathbf{x}) = \boldsymbol{\mathcal{B}}(x,y)\, e^{-i\omega t + ikz}. \tag{15.73} $$


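On functions of this form, $\partial_t \to -i\omega$ and $\partial_z \to ik$. Then, writing
explicitly the equations in components, eq. (15.65) becomes

$$ \frac{i\omega}{c^2}\mathcal{E}_x - ik\mathcal{B}_y = -\partial_y\mathcal{B}_z, \tag{15.74} $$
$$ \frac{i\omega}{c^2}\mathcal{E}_y + ik\mathcal{B}_x = \partial_x\mathcal{B}_z, \tag{15.75} $$
$$ \partial_x\mathcal{B}_y - \partial_y\mathcal{B}_x = -\frac{i\omega}{c^2}\mathcal{E}_z, \tag{15.76} $$

and similarly eq. (15.67) gives

$$ i\omega\mathcal{B}_x + ik\mathcal{E}_y = \partial_y\mathcal{E}_z, \tag{15.77} $$
$$ -i\omega\mathcal{B}_y + ik\mathcal{E}_x = \partial_x\mathcal{E}_z, \tag{15.78} $$
$$ \partial_x\mathcal{E}_y - \partial_y\mathcal{E}_x = i\omega\mathcal{B}_z. \tag{15.79} $$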


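$^{19}$ Indeed, it is instructive to see how the solution for electromagnetic
waves in vacuum, that we found in eqs. (9.70)-(9.81) using all four Maxwell's
equations, could have been obtained using only eqs. (15.65) and (15.67):
inserting the ansatz $\mathbf{E}(t,\mathbf{x}) = \mathbf{E}_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}$ and
$\mathbf{B}(t,\mathbf{x}) = \mathbf{B}_{\mathbf{k}}\, e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}$ (where, for the moment, $\omega$ and $\mathbf{k}$ are
independent) into eqs. (15.65) and (15.67) we get

$$ \mathbf{k}\times\mathbf{B}_{\mathbf{k}} + (\omega/c^2)\,\mathbf{E}_{\mathbf{k}} = 0, \tag{15.68} $$
$$ \mathbf{k}\times\mathbf{E}_{\mathbf{k}} - \omega\mathbf{B}_{\mathbf{k}} = 0. \tag{15.69} $$

Solving for $\mathbf{B}_{\mathbf{k}}$ in eq. (15.69), plugging it into eq. (15.68) and expanding the
resulting triple vector product, we get

$$ (\omega^2 - k^2c^2)\,\mathbf{E}_{\mathbf{k}} + c^2(\mathbf{E}_{\mathbf{k}}\cdot\mathbf{k})\,\mathbf{k} = 0. \tag{15.70} $$

Separating $\mathbf{E}_{\mathbf{k}}$ into its transverse and longitudinal parts,
$\mathbf{E}_{\mathbf{k}} = \mathbf{E}_{\mathbf{k},\perp} + E_{\mathbf{k},\parallel}\hat{\mathbf{k}}$, we get

$$ (\omega^2 - k^2c^2)\,\mathbf{E}_{\mathbf{k},\perp} + \omega^2 E_{\mathbf{k},\parallel}\hat{\mathbf{k}} = 0. \tag{15.71} $$

Since $\mathbf{E}_{\mathbf{k},\perp}$ and $\hat{\mathbf{k}}$ are orthogonal, if $\omega \neq 0$ we get the two separate
conditions $\omega^2 - k^2c^2 = 0$ and $E_{\mathbf{k},\parallel} = 0$, i.e., the dispersion relation in
vacuum, and the condition $E_{\mathbf{k},\parallel} = 0$ that, before, we had derived using
$\nabla\cdot\mathbf{E} = 0$. The same treatment can be made for the magnetic field, solving
for $\mathbf{E}_{\mathbf{k}}$ in eq. (15.68) and plugging it into eq. (15.69).

We now observe that eqs. (15.74) and (15.78) are two algebraic equations
for the two variables $\mathcal{E}_x$ and $\mathcal{B}_y$, and similarly eqs. (15.75) and (15.77)
are two algebraic equations for the two variables $\mathcal{E}_y$ and $\mathcal{B}_x$. The transverse
components of the fields are therefore determined algebraically, in
terms of the derivatives of the longitudinal components $\mathcal{E}_z$ and $\mathcal{B}_z$. The
solution of these algebraic equations is

$$ \gamma^2\mathcal{E}_x = ik\,\partial_x\mathcal{E}_z + i\omega\,\partial_y\mathcal{B}_z, \tag{15.80} $$
$$ \gamma^2\mathcal{E}_y = ik\,\partial_y\mathcal{E}_z - i\omega\,\partial_x\mathcal{B}_z, \tag{15.81} $$
$$ \gamma^2\mathcal{B}_x = ik\,\partial_x\mathcal{B}_z - \frac{i\omega}{c^2}\,\partial_y\mathcal{E}_z, \tag{15.82} $$
$$ \gamma^2\mathcal{B}_y = ik\,\partial_y\mathcal{B}_z + \frac{i\omega}{c^2}\,\partial_x\mathcal{E}_z, \tag{15.83} $$

where we have defined

$$ \gamma^2 = \left(\frac{\omega}{c}\right)^2 - k^2 \tag{15.84} $$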

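(not to be confused with the Lorentz boost factor, also conventionally
denoted by $\gamma$). Observing that

$$ \hat{\mathbf{z}}\times\nabla = \hat{\mathbf{z}}\times(\hat{\mathbf{x}}\partial_x + \hat{\mathbf{y}}\partial_y + \hat{\mathbf{z}}\partial_z) = \hat{\mathbf{y}}\partial_x - \hat{\mathbf{x}}\partial_y, \tag{15.85} $$

we can rewrite this more compactly as

$$ \gamma^2\mathcal{E}_i = ik\,\partial_i\mathcal{E}_z - i\omega\,(\hat{\mathbf{z}}\times\nabla)_i\,\mathcal{B}_z, \tag{15.86} $$
$$ \gamma^2\mathcal{B}_i = ik\,\partial_i\mathcal{B}_z + \frac{i\omega}{c^2}\,(\hat{\mathbf{z}}\times\nabla)_i\,\mathcal{E}_z, \tag{15.87} $$

where the index $i$ runs over the two values $\{x,y\}$. The remaining equations
to be satisfied are eqs. (15.76) and (15.79). Inserting eqs. (15.82)
and (15.83) into eq. (15.76) we get

$$ (\partial_x^2 + \partial_y^2 + \gamma^2)\,\mathcal{E}_z(x,y) = 0, \tag{15.88} $$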


which is a Helmholtz equation in two dimensions, of the type already
encountered (in three dimensions) in eq. (10.14). Similarly, inserting
eqs. (15.80) and (15.81) into eq. (15.79) we get

$$ (\partial_x^2 + \partial_y^2 + \gamma^2)\,\mathcal{B}_z(x,y) = 0. \tag{15.89} $$


The problem is therefore reduced to solving eqs. (15.88) and (15.89).
The solution for $\mathcal{E}_z(x,y)$ and $\mathcal{B}_z(x,y)$ will then determine all other
components through eqs. (15.86) and (15.87) (as long as $\gamma^2 \neq 0$, see below).
To solve eqs. (15.88) and (15.89), we must specify the boundary conditions
for $\mathcal{E}_z$ and $\mathcal{B}_z$ on the boundary of the vacuum region that we have
considered, i.e., on the boundary between the vacuum and the inner
surface of the conductor that makes the hollow pipe. We discuss this in
the next subsection.
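The algebraic step leading to eqs. (15.80)-(15.83) can be checked numerically. The sketch below is only an illustration under stated assumptions: it picks arbitrary complex values for the derivatives of $\mathcal{E}_z$ and $\mathcal{B}_z$ at one point, builds the transverse amplitudes from eqs. (15.80)-(15.83), and evaluates the residuals of eqs. (15.74), (15.75), (15.77) and (15.78). The function names and the test values ($c = 1$, $\omega = 2$, $k = 1.2$) are hypothetical choices with $\gamma^2 \neq 0$.

```python
import random

def transverse_fields(omega, k, c, dEz, dBz):
    """Transverse amplitudes from eqs. (15.80)-(15.83).

    dEz = (d_x E_z, d_y E_z) and dBz = (d_x B_z, d_y B_z) at one point.
    """
    gamma2 = (omega / c) ** 2 - k ** 2          # eq. (15.84)
    dxE, dyE = dEz
    dxB, dyB = dBz
    Ex = (1j * k * dxE + 1j * omega * dyB) / gamma2
    Ey = (1j * k * dyE - 1j * omega * dxB) / gamma2
    Bx = (1j * k * dxB - 1j * omega / c**2 * dyE) / gamma2
    By = (1j * k * dyB + 1j * omega / c**2 * dxE) / gamma2
    return Ex, Ey, Bx, By

def residuals(omega, k, c, dEz, dBz):
    """Left minus right-hand sides of eqs. (15.74), (15.75), (15.77), (15.78)."""
    Ex, Ey, Bx, By = transverse_fields(omega, k, c, dEz, dBz)
    dxE, dyE = dEz
    dxB, dyB = dBz
    return (
        1j * omega / c**2 * Ex - 1j * k * By + dyB,   # (15.74)
        1j * omega / c**2 * Ey + 1j * k * Bx - dxB,   # (15.75)
        1j * omega * Bx + 1j * k * Ey - dyE,          # (15.77)
        -1j * omega * By + 1j * k * Ex - dxE,         # (15.78)
    )

random.seed(0)

def rnd():
    """Arbitrary complex test value."""
    return complex(random.uniform(-1, 1), random.uniform(-1, 1))

# Units with c = 1 and arbitrary omega, k (assumed test values, gamma^2 != 0).
res = residuals(omega=2.0, k=1.2, c=1.0, dEz=(rnd(), rnd()), dBz=(rnd(), rnd()))
print(max(abs(r) for r in res))
```

The residuals vanish up to rounding errors for any choice of the longitudinal derivatives, as expected from a purely algebraic solution.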


15.4.2 Boundary conditions at the surface of conductors


To study the boundary conditions at the interface between vacuum and 
a conductor, we begin by observing that eqs. (13.64) and (13.68) are
valid even when the media (1) and (2) are the vacuum and a conductor,
respectively, since they have been derived from the Maxwell’s equations 
(13.47) and (13.48), that do not depend on the sources. Therefore, 
the tangential components of E and the normal component of B are 
continuous across the surface separating the vacuum (or a dielectric) 
from a conductor. However, as we discussed in Section 4.1.6, in the 
absence of an external applied voltage a conductor quickly reaches an 
equilibrium situation where any external field is screened, so that inside 
a conductor there is no current flowing, and E = 0. In the absence 
of currents (and neglecting magnetic dipoles at the atomic scale, so 
excluding the case of ferromagnets) there is no magnetic field either, so 
inside a conductor, in a static situation, E = B = 0. Then, eqs. (13.64) 
and (13.68) tell us that, approaching the boundary from the vacuum 
side, the tangential components of E and the normal component of B 
vanish, so the boundary conditions for a waveguide are 


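$$ \hat{\mathbf{n}}\times\mathbf{E} = 0, \qquad \hat{\mathbf{n}}\cdot\mathbf{B} = 0. \tag{15.90} $$


To solve eqs. (15.88) and (15.89), we need to extract from this the boundary
conditions on $\mathcal{E}_z$ and $\mathcal{B}_z$. Consider for instance a point on the
boundary where the normal $\hat{\mathbf{n}}$ is along the $x$ direction (recall that the
longitudinal axis of the waveguide has been set along the $\hat{\mathbf{z}}$ direction).
Then eq. (15.90) gives, on the boundary, $E_y = E_z = 0$ and $B_x = 0$ and
therefore

$$ \mathcal{E}_y = \mathcal{E}_z = 0, \qquad \mathcal{B}_x = 0. \tag{15.91} $$


The boundary condition for $\mathcal{E}_z$ is therefore $\mathcal{E}_z = 0$. For $\mathcal{B}_z$, we use
eqs. (15.81) and (15.82). Near the boundary, where $\mathcal{E}_y = 0$ and $\mathcal{B}_x = 0$,
they become

$$ ik\,\partial_y\mathcal{E}_z - i\omega\,\partial_x\mathcal{B}_z = 0, \tag{15.92} $$
$$ ik\,\partial_x\mathcal{B}_z - \frac{i\omega}{c^2}\,\partial_y\mathcal{E}_z = 0, \tag{15.93} $$

which can be combined to give

$$ \gamma^2\,\partial_x\mathcal{B}_z = 0. \tag{15.94} $$


Therefore (unless $\gamma = 0$), we have the boundary condition $\partial_x\mathcal{B}_z = 0$.
More generally, for $\hat{\mathbf{n}}$ generic rather than in the $x$ direction, we get
$\hat{\mathbf{n}}\cdot\nabla\mathcal{B}_z = 0$.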


15.4.3 TE, TM, and TEM modes 


In conclusion, we have to solve the eigenvalue equation

$$ -(\partial_x^2 + \partial_y^2)\,\mathcal{E}_z(x,y) = \gamma^2\mathcal{E}_z(x,y), \tag{15.95} $$

with the boundary condition

$$ (\mathcal{E}_z)|_S = 0, \tag{15.96} $$

and the eigenvalue equation

$$ -(\partial_x^2 + \partial_y^2)\,\mathcal{B}_z(x,y) = \gamma^2\mathcal{B}_z(x,y), \tag{15.97} $$

with the boundary condition

$$ \hat{\mathbf{n}}\cdot(\nabla\mathcal{B}_z)|_S = 0, \tag{15.98} $$


where we have taken the $z$ axis as the propagation direction and $\hat{\mathbf{n}}$ is
the normal to the boundary $S$ of the waveguide. These two equations
and boundary conditions are independent of each other. The transverse
components of the electric and magnetic field are then found from
eqs. (15.86) and (15.87). We can distinguish three classes of solutions.


TE modes. These are solutions with $\mathcal{E}_z = 0$, which trivially satisfies
eq. (15.95) and the boundary condition (15.96), and $\mathcal{B}_z \neq 0$. Observe,
from eq. (15.86), that, even if $\mathcal{E}_z = 0$, as long as $\mathcal{B}_z \neq 0$, $\mathcal{E}_x$ and $\mathcal{E}_y$ are
non-zero. In these solutions, therefore, the electric field is transverse,
hence the name TE (for "transverse electric"), while the magnetic field
has both transverse and longitudinal components. Equation (15.97) with
the boundary condition (15.98) is an eigenvalue equation, that has
solutions only for a discrete set of values of $\gamma^2$. From eq. (15.84), the
corresponding dispersion relation is

$$ \omega^2 = c^2(k^2 + \gamma^2), \tag{15.99} $$


where $k = k_z$ is a continuous variable, corresponding to the fact that we
have assumed a straight infinite waveguide along the $z$ direction, while
$\gamma^2$ plays the role of $k_x^2 + k_y^2$ and takes discrete values because of the
boundary conditions in the $x$ and $y$ directions, exactly as it happens for
the vibration modes of a string. Since $\mathcal{E}_z = 0$, eqs. (15.86) and (15.87)
simplify to

$$ \gamma^2\mathcal{E}_i = -i\omega\,(\hat{\mathbf{z}}\times\nabla)_i\,\mathcal{B}_z, \tag{15.100} $$
$$ \gamma^2\mathcal{B}_i = ik\,\partial_i\mathcal{B}_z. \tag{15.101} $$


TM modes. These are the solutions with $\mathcal{B}_z = 0$ and $\mathcal{E}_z \neq 0$, so
now the magnetic field is transverse, while the electric field has both
transverse and longitudinal components. From eqs. (15.86) and (15.87),
the transverse components are given by

$$ \gamma^2\mathcal{E}_i = ik\,\partial_i\mathcal{E}_z, \tag{15.102} $$
$$ \gamma^2\mathcal{B}_i = \frac{i\omega}{c^2}\,(\hat{\mathbf{z}}\times\nabla)_i\,\mathcal{E}_z. \tag{15.103} $$


The dispersion relation is again eq. (15.99). However, the eigenvalues
$\gamma^2$ are, in general, different from those of the TE modes, since $\gamma^2$ is
now determined by eq. (15.95) with the boundary condition (15.96), which
is different from the boundary condition (15.98) of the TE modes.


TEM modes. Finally, one can search for a solution transverse both in
the electric and magnetic fields, $\mathcal{E}_z = \mathcal{B}_z = 0$. From eqs. (15.86) and
(15.87), we see that a non-vanishing solution for the transverse fields is
possible only if $\gamma^2 = 0$, so the dispersion relation becomes

$$ \omega = kc, \tag{15.104} $$


as in vacuum. In this case, eqs. (15.86) and (15.87) become trivial
identities $0 = 0$ and do not allow us to determine the transverse components.
To determine them, we must rather go back to Maxwell's equations.
Consider first eqs. (15.65) and (15.67), in the form (15.74)-(15.79).
Setting $\mathcal{E}_z = \mathcal{B}_z = 0$ and $\omega = kc$, eqs. (15.74) and (15.75) give

$$ c\mathcal{B}_x = -\mathcal{E}_y, \qquad c\mathcal{B}_y = \mathcal{E}_x, \tag{15.105} $$


and the same conditions are obtained from eqs. (15.77) and (15.78).
Equations (15.79) and (15.76) give

$$ \partial_x\mathcal{E}_y - \partial_y\mathcal{E}_x = 0, \qquad \partial_x\mathcal{B}_y - \partial_y\mathcal{B}_x = 0. \tag{15.106} $$


The remaining equations are the divergence equations (15.64) and (15.66)
that, setting $\mathcal{E}_z = \mathcal{B}_z = 0$, become

$$ \partial_x\mathcal{E}_x + \partial_y\mathcal{E}_y = 0, \qquad \partial_x\mathcal{B}_x + \partial_y\mathcal{B}_y = 0. \tag{15.107} $$


We therefore have two two-dimensional fields $\boldsymbol{\mathcal{E}}(x, y)$ and $\boldsymbol{\mathcal{B}}(x, y)$ that
satisfy

$$ \nabla_T\cdot\boldsymbol{\mathcal{E}} = 0, \qquad \nabla_T\times\boldsymbol{\mathcal{E}} = 0, \tag{15.108} $$


where $\nabla_T = \hat{\mathbf{x}}\partial_x + \hat{\mathbf{y}}\partial_y$ is the two-dimensional gradient in the transverse
plane, and similarly $\nabla_T\cdot\boldsymbol{\mathcal{B}} = \nabla_T\times\boldsymbol{\mathcal{B}} = 0$. A theorem states that, if
a two-dimensional vector field $\boldsymbol{\mathcal{E}}$ satisfies $\nabla_T\cdot\boldsymbol{\mathcal{E}} = \nabla_T\times\boldsymbol{\mathcal{E}} = 0$ in a
simply connected domain, with vanishing tangential component on the
boundary as required by eq. (15.90), then $\boldsymbol{\mathcal{E}} = 0$. However, this is no longer true
if the domain is not simply connected (i.e., if there are closed curves
that cannot be deformed continuously to a point). So, for instance,
we can take as waveguide a coaxial cable, made of an inner cylindrical
conductor whose radius, in the transverse plane, is $r_1$, and an outer
hollow conductor of radius $r_2 > r_1$, so, in the transverse plane, the
waveguide consists of the region $r_1^2 < x^2 + y^2 < r_2^2$. In this case, it is
possible to have non-zero solutions for $\mathcal{E}_x$, $\mathcal{E}_y$ (and, similarly, for $\mathcal{B}_x$, $\mathcal{B}_y$),
with $\mathcal{E}_z = \mathcal{B}_z = 0$, corresponding to modes in which both the electric
and the magnetic fields are transverse, called TEM modes.
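To make the coaxial case concrete, one can check numerically that a purely radial field falling off as $1/r$ satisfies eq. (15.108) in the annular region. The sketch below is an illustration, not a derivation from the text: the radial profile, its unit normalization, the radii $r_1 = 1$ and $r_2 = 3$, and the function names are assumptions. Such a field is normal to both circular conductors, so its tangential component vanishes there, and the magnetic field then follows from eq. (15.105).

```python
def coax_field(x, y):
    """Transverse field proportional to r_hat / r (arbitrary normalization)."""
    r2 = x * x + y * y
    return x / r2, y / r2

def div_and_curl(field, x, y, h=1e-5):
    """Central finite-difference estimates of div_T and the z-component of curl_T."""
    dEx_dx = (field(x + h, y)[0] - field(x - h, y)[0]) / (2 * h)
    dEy_dy = (field(x, y + h)[1] - field(x, y - h)[1]) / (2 * h)
    dEy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dEx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dEx_dx + dEy_dy, dEy_dx - dEx_dy

# A sample point inside the annulus r1 < r < r2 (r1 = 1, r2 = 3 are assumed values).
div, curl = div_and_curl(coax_field, 1.3, 1.1)
print(div, curl)
```

Both finite-difference estimates vanish within the discretization error, even though the field itself is non-zero, which is possible only because the annulus is not simply connected.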


Example: rectangular waveguide. As an example, consider a waveguide
whose transverse section is a rectangle $0 < x < a$, $0 < y < b$. Since this
domain is simply connected, TEM modes do not exist. The TE modes are the
solutions of eq. (15.97), with the boundary conditions

$$ \frac{\partial\mathcal{B}_z}{\partial x}(x=0,y) = \frac{\partial\mathcal{B}_z}{\partial x}(x=a,y) = 0, \tag{15.109} $$
$$ \frac{\partial\mathcal{B}_z}{\partial y}(x,y=0) = \frac{\partial\mathcal{B}_z}{\partial y}(x,y=b) = 0. \tag{15.110} $$


The eigenfunctions are characterized by two integers $(m,n)$, and are

$$ (\mathcal{B}_z)_{mn} = B_{mn}\cos\left(\frac{m\pi x}{a}\right)\cos\left(\frac{n\pi y}{b}\right), \tag{15.111} $$

with $B_{mn}$ constant, and the corresponding eigenvalues are

$$ \gamma_{mn}^2 = \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2. \tag{15.112} $$

The corresponding modes are denoted as TE$_{mn}$. The TE and TM modes
are solutions only if $\gamma^2 \neq 0$, since otherwise eqs. (15.80)-(15.83) degenerate
to $0 = 0$ identities and one must rather resort to Maxwell's equations
in the original form; as we have seen, this gives the condition for the
TEM modes, which are absent in a rectangular waveguide. Therefore,
the case $m = n = 0$ is excluded, and, taking $a < b$, the lowest TE mode
is the mode TE$_{01}$.

Similarly, the TM modes are the solutions of eq. (15.95) with $\mathcal{E}_z$
vanishing on the boundaries,

$$ \mathcal{E}_z(x=0,y) = \mathcal{E}_z(x=a,y) = 0, \tag{15.113} $$
$$ \mathcal{E}_z(x,y=0) = \mathcal{E}_z(x,y=b) = 0. \tag{15.114} $$

The solutions are

$$ (\mathcal{E}_z)_{mn}(x,y) = E_{mn}\sin\left(\frac{m\pi x}{a}\right)\sin\left(\frac{n\pi y}{b}\right). \tag{15.115} $$


The corresponding eigenvalues are again given by eq. (15.112), but now
$m \geq 1$ and $n \geq 1$, since otherwise the solution vanishes. Hence, the
lowest TM mode is the mode TM$_{11}$.

From the dispersion relation

$$ \omega_{mn}^2(k) = c^2\left(k^2 + \gamma_{mn}^2\right) \tag{15.116} $$


it follows that, in the waveguide, the minimum frequency that can propagate
is $\omega = c\gamma_{01} = c\pi/b$, and therefore the wavelength $\lambda = 2\pi c/\omega$
must be less than $2b$. Since $k$ is a continuous variable, all frequencies
above this limiting value, i.e., all wavelengths $\lambda < 2b$, can propagate in
the waveguide.
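The cutoff structure implied by eqs. (15.112) and (15.116) can be tabulated with a few lines of code. The following is a minimal sketch under stated assumptions: the cross-section $a = 1$ cm, $b = 2$ cm is an arbitrary illustrative choice with $a < b$, and the helper names (`gamma_mn`, `cutoff_frequency`, `lowest_modes`) are hypothetical. The cutoff of each mode is the frequency at $k = 0$, i.e. $f = c\gamma_{mn}/(2\pi)$.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def gamma_mn(m, n, a, b):
    """Transverse eigenvalue gamma_mn of eq. (15.112), in 1/m."""
    return math.sqrt((m * math.pi / a) ** 2 + (n * math.pi / b) ** 2)

def cutoff_frequency(m, n, a, b):
    """Cutoff frequency f = c * gamma_mn / (2 pi), reached at k = 0."""
    return C * gamma_mn(m, n, a, b) / (2.0 * math.pi)

def lowest_modes(a, b, count=4, max_index=3):
    """The `count` lowest TE and TM modes, sorted by cutoff frequency."""
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if m == 0 and n == 0:
                continue                      # gamma = 0 is excluded
            modes.append((cutoff_frequency(m, n, a, b), "TE", m, n))
            if m >= 1 and n >= 1:             # TM needs both indices >= 1
                modes.append((cutoff_frequency(m, n, a, b), "TM", m, n))
    return sorted(modes)[:count]

a, b = 0.010, 0.020   # assumed cross-section in metres, with a < b
for f, kind, m, n in lowest_modes(a, b):
    print(f"{kind}{m}{n}: cutoff {f / 1e9:.2f} GHz")
```

For any $a < b$ the first entry is the TE$_{01}$ mode, with cutoff $c/(2b)$, in agreement with the condition $\lambda < 2b$ found above.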


Exercise. Compute the electric and magnetic fields in the transverse
plane for the TE$_{01}$ and TM$_{11}$ modes in a rectangular waveguide.


Scattering of electromagnetic radiation


In this chapter we discuss the scattering of electromagnetic waves by 
charged particles. We will examine separately the scattering from free 
electrons and that from electrons bound in atoms or molecules. 


16.1 Scattering cross-section 


We begin by defining the scattering cross-section, which provides a quan- 
titative way of describing the outcome of a scattering process. The 
cross-section is defined, at a fundamental level, with reference to collisions
between particles. Consider a beam of particles of type 1, with
number density $n_1$ and velocity $v_1$, impinging on a target made of particles
of type 2 and number density $n_2$, in the frame where the particles
of type 2 are at rest. This could be the case, for instance, of a beam
of electrons impinging on a block of material. The number of scattering
events, $dN$, that take place in a volume $dV$ of the material in a time
interval $dt$ must be proportional to the incoming flux $n_1v_1$ and to the
density of targets $n_2$. The proportionality constant is, by definition, the
cross-section $\sigma$,

$$ dN = \sigma\, v_1 n_1 n_2\, dV\, dt. \tag{16.1} $$


Dimensional analysis shows immediately that $\sigma$ has the dimensions of an
area. At a more detailed level, we can consider the number $(dN/d\Omega)\,d\Omega$
of particles scattered within an infinitesimal solid angle $d\Omega = d\cos\theta\, d\phi$
centered around a given direction, identified by polar angles $(\theta, \phi)$, and
define the differential cross-section $d\sigma/d\Omega$ by

$$ \frac{dN}{d\Omega} = \frac{d\sigma(\theta,\phi)}{d\Omega}\, v_1 n_1 n_2\, dV\, dt, \tag{16.2} $$


so that the total cross-section $\sigma$ is obtained from

$$ \sigma = \int d\Omega\, \frac{d\sigma(\theta,\phi)}{d\Omega}. \tag{16.3} $$


Consider a monochromatic beam of particles, i.e., a beam where all
incoming particles have the same energy $E$. Then $En_1v_1$ is the incoming
energy flux. Consider first for simplicity elastic scattering, where the
incoming particle is scattered from a fixed center, so that its final energy
is the same as the initial energy. Then, multiplying by $E$ and integrating
over the volume of the target, eq. (16.1) gives

$$ E\,dN = \sigma\, E n_1 v_1\, dt \int n_2\, dV, \tag{16.4} $$


where, with a slight abuse of notation, we keep the same symbol $dN$
for what is now the number of scattering events differential with respect
to $dt$ (rather than with respect to $dt\,dV$). We now observe that
$E\,dN/dt = d(EN)/dt$ is the scattered energy per unit time, i.e., the
scattered power, while $En_1v_1$ is the incoming energy flux, and $N_2 = \int n_2\,dV$
is the number of targets. Therefore, the cross-section is equal to the ratio
of the scattered power $P$ to the incident energy flux $I$, per unit target.
Thus, if we consider a wave impinging on a single target,

$$ \sigma = \frac{P}{I}, \tag{16.5} $$

or, more generally,

$$ \frac{d\sigma(\theta,\phi)}{d\Omega} = \frac{1}{I}\frac{dP(\theta,\phi)}{d\Omega}. \tag{16.6} $$

We can take this as the definition of the classical cross-section for the 
elastic scattering of an electromagnetic wave impinging on a single target 
(the underlying reason being that, at a fundamental level, an electro- 
magnetic wave can be seen as a collection of particles, the photons). 
Elastic scattering in this case corresponds to the fact that the frequency
$\omega$ of the scattered electromagnetic wave is the same as the frequency $\omega_{\rm in}$
of the incoming radiation (due to the fact that the energy of a photon is
related to its frequency by the quantum-mechanical relation $E = \hbar\omega$).

More generally, the incoming energy could be partly absorbed and 
dissipated into heat in the material, and partly re-radiated, not neces- 
sarily at the same frequency as that of the incoming wave. The radiated 
power will therefore have a frequency spectrum,

$$ P = \int d\omega\, \frac{dP(\omega)}{d\omega}, \tag{16.7} $$

and correspondingly we can define the scattering cross-section, differential
with respect to frequency, $d\sigma_{\rm scatt}/d\omega$,

$$ \frac{d\sigma_{\rm scatt}(\omega)}{d\omega} = \frac{1}{I}\frac{dP(\omega)}{d\omega}, \tag{16.8} $$

as well as the cross-section differential both with respect to frequency
and to solid angle,

$$ \frac{d\sigma_{\rm scatt}(\omega;\theta,\phi)}{d\omega\, d\Omega} = \frac{1}{I}\frac{dP(\omega;\theta,\phi)}{d\omega\, d\Omega}. \tag{16.9} $$

The absorption part, in contrast, will be determined by the term $\mathbf{j}\cdot\mathbf{E}$ in
eq. (3.35). If we denote by $P_{\rm abs}$ the energy absorbed per unit time by
the material, we can similarly define an absorption cross-section from

$$ \sigma_{\rm abs} = \frac{P_{\rm abs}}{I}. \tag{16.10} $$

For the absorbed energy there is no quantity corresponding to the angles
$(\theta, \phi)$ of the emitted radiation, and therefore no sense in which we can
consider the differential with respect to the solid angle, as in eq. (16.6).
However, we can still consider the power absorbed per unit frequency,
and therefore the frequency spectrum

$$ \frac{d\sigma_{\rm abs}(\omega)}{d\omega} = \frac{1}{I}\frac{dP_{\rm abs}(\omega)}{d\omega}. \tag{16.11} $$


For an incoming electromagnetic wave, the incident energy flux is given
by the Poynting vector (9.59).


16.2 Scattering on a free electron 


We now consider an electromagnetic wave impinging on a free electron 
initially at rest. The electromagnetic field of the wave accelerates the 
electron, that therefore emits radiation in all directions. In the classical 
picture, this is the origin of the scattered wave. The action of the wave 
on the electron is determined by the Lorentz force (3.6). We assume that 
the electric field of the wave is not too large, so that the motion induced 
on the electron remains non-relativistic; we can then neglect the term 
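$\mathbf{v}\times\mathbf{B}$ compared to $\mathbf{E}$, since in an electromagnetic wave $|\mathbf{B}| = |\mathbf{E}|/c$, and
$v/c \ll 1$. In this non-relativistic limit, the acceleration of the electron
is then given by

$$ m_e\ddot{\mathbf{x}}(t) = -e\mathbf{E}[t,\mathbf{x}(t)], \tag{16.12} $$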


where x(t) is the electron position. To understand the limit of validity 
of this approximation, consider for instance a field $\mathbf{E}(t,\mathbf{x}) = E\hat{\mathbf{x}}\cos(\omega t)$
that does not change appreciably in space in the region over which
the electron performs its oscillatory motion. Then the integration of
eq. (16.12), with initial conditions $x(t=0) = 0$ and $\dot{x}(t=0) = 0$, gives

$$ \dot{x}(t) = -\frac{eE}{m_e\omega}\sin(\omega t), \qquad x(t) = \frac{eE}{m_e\omega^2}\left[\cos(\omega t) - 1\right]. \tag{16.13} $$


The assumption of non-relativistic motion of the electron is therefore
satisfied if the electric field is such that

$$ \frac{eE}{m_e\omega} \ll c. \tag{16.14} $$


For instance, for visible light with $\lambda = 500$ nm, this limiting field
corresponds to $6 \times 10^{12}$ V/m, which is quite a large field that can only be
obtained with picosecond laser pulses. In turn, this implies that

$$ x_0 \equiv \frac{eE}{m_e\omega^2} \ll \frac{c}{\omega} = \frac{1}{k}. \tag{16.15} $$


This means that the amplitude $x_0$ of the oscillations is such that $kx_0 \ll 1$.
It is therefore consistent to set $e^{i\mathbf{k}\cdot\mathbf{x}} \simeq 1$ in the expression (9.49) of
the electric field of the propagating wave, and write simply


$$ \mathbf{E}(t) = {\rm Re}\left[\hat{\mathbf{e}}(\mathbf{k})\, E_{\mathbf{k}}\, e^{-i\omega t}\right]. \tag{16.16} $$

$^{1}$ We already met this combination in Note 8 on page 106 in the context
of the electron self-energy, as well as in eq. (12.167), in the context of the
Abraham-Lorentz equation.
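The order of magnitude quoted for the limiting field in eq. (16.14) can be reproduced directly. The sketch below is only an illustration: the physical constants are standard SI values, the function names are hypothetical, and the field of $10^9$ V/m used in the last line is an assumed example, not a value from the text.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg
C = 299_792_458.0             # speed of light, m/s

def limiting_field(wavelength):
    """Field amplitude at which eE/(m_e omega) equals c, cf. eq. (16.14)."""
    omega = 2.0 * math.pi * C / wavelength
    return M_E * omega * C / E_CHARGE

def k_times_x0(field, wavelength):
    """Dimensionless ratio k*x0 = eE/(m_e omega c), cf. eq. (16.15)."""
    return field / limiting_field(wavelength)

E_lim = limiting_field(500e-9)
print(f"{E_lim:.2e} V/m")
print(k_times_x0(1.0e9, 500e-9))   # 1e9 V/m is an assumed, illustrative field
```

Since $kx_0$ is the same ratio $eE/(m_e\omega c)$ that controls the non-relativistic condition, any field well below the limiting value automatically justifies neglecting the spatial dependence of the wave over the electron orbit.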


The dipole moment of the electron is $\mathbf{d} = -e\mathbf{x}$, so eq. (16.12) gives

$$ \ddot{\mathbf{d}} = \frac{e^2}{m_e}\,\mathbf{E}. \tag{16.17} $$


Consider for instance an incoming electromagnetic wave linearly polarized
along the $\hat{\mathbf{e}}$ direction, so that $\mathbf{E} = E\hat{\mathbf{e}}$. The power radiated per unit
solid angle by a non-relativistic electron is given by Larmor's formula
(10.144),

$$ \frac{dP(t;\theta)}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{4\pi c^3}\,\frac{e^4E^2}{m_e^2}\,\sin^2\theta, \tag{16.18} $$


where $\theta$ is the polar angle measured from the $\hat{\mathbf{e}}$ axis. The incident energy
flux $I = |\mathbf{S}|$ of the incoming wave is given by eq. (10.140), which, using
$\mu_0 = 1/(\epsilon_0c^2)$, can be rewritten as

$$ I = c\epsilon_0E^2. \tag{16.19} $$

Therefore,

$$ \frac{d\sigma}{d\Omega} = r_0^2\sin^2\theta, \tag{16.20} $$

where we have defined the classical electron radius$^{1}$

$$ r_0 = \frac{1}{4\pi\epsilon_0}\,\frac{e^2}{m_ec^2}. \tag{16.21} $$


Integrating over the angles as in eq. (10.145) we get the Thomson
cross-section $\sigma_T$,

$$ \sigma_T = \frac{8\pi}{3}\,r_0^2. \tag{16.22} $$


Notice that the frequency of the radiated electromagnetic wave is the 
same as the frequency of the incoming wave: an incoming wave oscil- 
lating at the frequency w induces an oscillatory motion of the free elec- 
tron again at the frequency w, see eq. (16.13). In turn, as we see from 
eq. (10.131), this motion generates dipole radiation at the frequency w. 
Therefore, the incident and scattered waves have the same frequency. 
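The numerical size of the Thomson cross-section follows from eqs. (16.20)-(16.22). The sketch below evaluates $r_0$ from standard SI constants and integrates $r_0^2\sin^2\theta$ over the sphere with a simple midpoint rule, to compare with the closed form $8\pi r_0^2/3$. The function names and the number of integration steps are arbitrary choices made for this illustration.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
M_E = 9.1093837015e-31        # electron mass, kg
C = 299_792_458.0             # speed of light, m/s
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m

def classical_electron_radius():
    """r_0 = e^2 / (4 pi eps0 m_e c^2), eq. (16.21), in metres."""
    return E_CHARGE**2 / (4.0 * math.pi * EPS0 * M_E * C**2)

def thomson_cross_section(steps=20_000):
    """Integrate r_0^2 sin^2(theta) over the sphere with the midpoint rule."""
    r0 = classical_electron_radius()
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        # d(Omega) = 2 pi sin(theta) d(theta) after the trivial phi integral
        total += math.sin(theta) ** 3 * dtheta
    return 2.0 * math.pi * r0**2 * total

r0 = classical_electron_radius()
sigma_T = thomson_cross_section()
print(r0, sigma_T, 8.0 * math.pi / 3.0 * r0**2)
```

The last two printed numbers agree to high accuracy, which checks the angular integral behind eq. (16.22).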
It should be observed that, in quantum mechanics, this result is mod- 
ified at sufficiently large frequencies, and the frequency of the scattered 
wave is lower than that of the incoming wave. This is a simple conse- 
quence of energy-momentum conservation, once we borrow from quan- 
tum mechanics the information that, at the quantum level, an electro- 
magnetic wave with frequency w is a collection of particles, the photons, 
with energy $E = \hbar\omega$ and momentum $\mathbf{k} = \hbar(\omega/c)\hat{\mathbf{k}}$. Consider the scattering
process of a photon on an electron initially at rest. Take for instance
the $(x,y)$ plane as the scattering plane, with the photon coming along
the $x$ axis, and denote by $\theta$ the scattering angle of the final photon in
the $(x,y)$ plane, see Fig. 16.1. Then, the initial four-momentum of the
photon is $k^\mu = (E_\gamma/c, \mathbf{k})$ with

$$ E_\gamma = \hbar\omega, \qquad \mathbf{k} = \frac{\hbar\omega}{c}\,(1, 0, 0), \tag{16.23} $$

and the initial four-momentum of the electron is $p^\mu = (E_e/c, \mathbf{p})$ with

$$ E_e = m_ec^2, \qquad \mathbf{p} = 0. \tag{16.24} $$

The final four-momentum of the photon is $k'^\mu = (E'_\gamma/c, \mathbf{k}')$, where

$$ E'_\gamma = \hbar\omega', \qquad \mathbf{k}' = \frac{\hbar\omega'}{c}\,(\cos\theta, \sin\theta, 0), \tag{16.25} $$


corresponding to a scattering angle $\theta$ in the $(x,y)$ plane, and a generic
frequency $\omega'$ that we will determine using energy-momentum conservation.
The final four-momentum of the electron is $p'^\mu = (E'_e/c, \mathbf{p}')$, with

$$ E'_e = \sqrt{m_e^2c^4 + |\mathbf{p}'|^2c^2}, \qquad \mathbf{p}' = |\mathbf{p}'|\,(\cos\psi, \sin\psi, 0), \tag{16.26} $$

where we denoted by $\psi$ the angle at which the electron recoils with
respect to the direction of the incoming photon, see again Fig. 16.1.

Fig. 16.1 The geometry of the Compton scattering described in the text.
From energy conservation we have $E_e'^2 = (E_e + E_\gamma - E'_\gamma)^2$, which gives

$$ |\mathbf{p}'|^2c^2 = \hbar^2(\omega - \omega')^2 + 2m_ec^2\hbar(\omega - \omega'), \tag{16.27} $$


while momentum conservation along the $x$ and $y$ axes gives, respectively,

$$ \hbar\omega = \hbar\omega'\cos\theta + c|\mathbf{p}'|\cos\psi, \tag{16.28} $$
$$ 0 = \hbar\omega'\sin\theta + c|\mathbf{p}'|\sin\psi. \tag{16.29} $$


We therefore have three equations for three variables $\omega'$, $|\mathbf{p}'|$, and $\psi$, for
a given scattering angle $\theta$. Solving the equations, we get

$$ \omega - \omega' = \frac{\hbar}{m_ec^2}\,\omega\omega'\,(1 - \cos\theta). \tag{16.30} $$


This shows that $\omega' \leq \omega$ for all scattering angles, with the equality
satisfied only for forward scattering, $\theta = 0$. This is due to the fact that,
because of the momentum carried by the photon, the electron recoils in the
scattering process (except for $\theta = 0$, where the photon is re-emitted in
the same direction), and therefore acquires some kinetic energy. Therefore,
part of the energy of the initial photon is transferred to the electron,
and the final photon has a smaller energy. Equation (16.30) can
be written in a slightly more elegant form in terms of the wavelengths
$\lambda = 2\pi c/\omega$ and $\lambda' = 2\pi c/\omega'$, as

$$ \lambda' - \lambda = \frac{2\pi\hbar}{m_ec}\,(1 - \cos\theta). \tag{16.31} $$

This shift in the wavelength is known as the Compton effect, and the
length-scale

$$ r_C = \frac{\hbar}{m_ec} \tag{16.32} $$

is called the Compton radius of the electron. We see that the effect
becomes important when the reduced wavelength $\lambda/(2\pi)$ of the
incoming radiation is of order of $r_C$. Numerically, $r_C \simeq 4 \times 10^{-11}$ cm,
and the corresponding wavelengths are in the domain of X rays.

Therefore, the classical Thomson scattering formula is valid for
frequencies small (i.e., wavelengths large) compared to those of X rays.
When $\omega$ becomes of order $c/r_C$, i.e., when the energy $\hbar\omega$ of the incoming
photon becomes of order of the rest energy of the electron $m_ec^2$,
the classical formula is no longer valid. Historically, the Compton
experiment of scattering of X rays on electrons, that was first carried out
between 1919 and 1922, was crucial to show that, at the quantum level,
light can behave as particles, leading to the concept of particle-wave
duality.


16.3 Scattering on a bound electron 


We now consider the scattering of an electromagnetic wave on an electron bound in an atom or in a molecule. A complete description requires quantum mechanics, taking into account the discrete energy levels of the bound electron. Furthermore, just as for the scattering on a free electron, if the frequency of the electromagnetic wave is in the X-ray regime or larger, we should also take into account the quantum nature of the electromagnetic field, which eventually requires the use of quantum field theory. Here we discuss the purely classical computation, in which both the electron and the electromagnetic field are treated classically. Within such a classical approach, a simple description can be obtained by modeling the bound electron as a damped harmonic oscillator with natural frequency $\omega_0$ and damping constant $\gamma_0$, forced by the external field due to a monochromatic electromagnetic wave with frequency $\omega$, similarly to what we already did in Section 14.3 when we developed the Drude-Lorentz model for the dielectric constant. As in the case of the free electron, we assume that the amplitude $E = |\mathbf{E}|$ of the electric field satisfies eq. (16.14), so that the motion of the electron under the action of the electromagnetic wave remains non-relativistic. We can then write a Newtonian equation of motion for the electron, using the Lorentz force, and we can neglect the effect of the magnetic field. We have also seen that, for electric fields that satisfy eq. (16.14), the amplitude of the oscillations is such that $k a_0 \ll 1$ (for a free electron, and therefore even more so for a bound electron). We can then approximate $e^{i\mathbf{k}\cdot\mathbf{x}} \simeq 1$ in the expression (9.49) of the electric field of the wave, which is then given simply by eq. (16.16). The equation of motion of the electron, with position $\mathbf{x}_0(t)$, in the presence of the external field $\mathbf{E}(t)$ of the electromagnetic wave, is therefore the one that was already written in eq. (14.33), with $\mathbf{E}(t)$ given by eq. (16.16),


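$$\ddot{\mathbf{x}}_0 + \gamma_0\dot{\mathbf{x}}_0 + \omega_0^2\mathbf{x}_0 = -\frac{e}{m_e}\,{\rm Re}\left[\hat{\boldsymbol{\epsilon}}(\mathbf{k})\,E_{\mathbf{k}}\,e^{-i\omega t}\right]. \tag{16.33}$$

We take the propagation direction of the electromagnetic wave along the x axis, $\hat{\mathbf{k}} = \hat{\mathbf{x}}$, and we consider a linearly polarized wave with $\hat{\boldsymbol{\epsilon}}(\mathbf{k}) = \hat{\mathbf{z}}$ and $E_{\mathbf{k}} = E_0$ real. Then, eq. (16.33) becomes

$$\ddot{z}_0(t) + \gamma_0\dot{z}_0(t) + \omega_0^2 z_0(t) = -\frac{eE_0}{m_e}\,{\rm Re}\left[e^{-i\omega t}\right]. \tag{16.34}$$

Searching the solution in the form

$$z_0(t) = {\rm Re}\left[\tilde{z}_0(\omega)\,e^{-i\omega t}\right], \tag{16.35}$$

we get

$$\tilde{z}_0(\omega) = \frac{eE_0}{m_e}\,\frac{1}{\omega^2 + i\omega\gamma_0 - \omega_0^2}, \tag{16.36}$$

so

$$z_0(t) = \frac{eE_0}{m_e}\,{\rm Re}\left[\frac{e^{-i\omega t}}{\omega^2 + i\omega\gamma_0 - \omega_0^2}\right]. \tag{16.37}$$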


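The z component of the dipole moment is given by $d_z(t) = -e z_0(t)$, and then

$$\ddot{d}_z(t) = \frac{e^2E_0\omega^2}{m_e}\,{\rm Re}\left[\frac{e^{-i\omega t}}{\omega^2 + i\omega\gamma_0 - \omega_0^2}\right] = \left(\frac{e^2E_0\omega^2}{m_e}\right)\frac{(\omega^2-\omega_0^2)\cos\omega t - \omega\gamma_0\sin\omega t}{(\omega^2-\omega_0^2)^2 + \omega^2\gamma_0^2}. \tag{16.38}$$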


Note that the dipole oscillates at the frequency $\omega$ of the incoming wave so, again, classically the scattered wave has the same frequency as the incoming wave. Using the Larmor formula (10.144), the radiated power is obtained from $[\ddot{d}_z(t)]^2$. Rather than the power radiated instantaneously, it is convenient to consider the power averaged over one period of oscillation, which better corresponds to what can be actually measured. When we average over one period $T = 2\pi/\omega$ we get

$$\langle\cos^2(\omega t)\rangle = \frac{1}{T}\int_0^T dt\,\cos^2(\omega t) = \frac{1}{2\pi}\int_0^{2\pi} d\alpha\,\cos^2\alpha = \frac{1}{2}. \tag{16.39}$$

Similarly $\langle\sin^2(\omega t)\rangle = 1/2$, while $\langle\sin(\omega t)\cos(\omega t)\rangle = 0$. Therefore,

$$\langle[\ddot{d}_z(t)]^2\rangle = \frac{1}{2}\left(\frac{e^2E_0}{m_e}\right)^2\frac{\omega^4}{(\omega^2-\omega_0^2)^2 + \omega^2\gamma_0^2}, \tag{16.40}$$

and the average radiated power, per unit solid angle, is

$$\frac{d\langle P\rangle}{d\Omega} = \frac{1}{4\pi\epsilon_0}\,\frac{1}{8\pi c^3}\left(\frac{e^2E_0}{m_e}\right)^2\frac{\omega^4}{(\omega^2-\omega_0^2)^2 + \omega^2\gamma_0^2}\,\sin^2\theta, \tag{16.41}$$

Fig. 16.2 The cross-section $\sigma(\omega)$, normalized to the Thomson cross-section $\sigma_T$, as a function of $\omega/\omega_0$, for $\gamma_0/\omega_0 = 0.4$.


where $\theta$ is the polar angle measured from the z axis. Since we have averaged the power over one cycle, we also average the incident flux over one cycle, and we extend the definition (16.5) to $\sigma = \langle P\rangle/\langle I\rangle$. Equation (9.59) gives

$$\langle I(t)\rangle = \frac{1}{2}\,\epsilon_0 c E_0^2. \tag{16.42}$$

The differential scattering cross-section is therefore

$$\frac{d\sigma_{\rm scatt}(\omega;\theta)}{d\Omega} = r_0^2\,\frac{\omega^4}{(\omega^2-\omega_0^2)^2 + \omega^2\gamma_0^2}\,\sin^2\theta, \tag{16.43}$$


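where we have used the definition (16.21) of the classical electron radius, and we wrote explicitly as an argument the frequency of the incident wave. If we set $\omega_0 = \gamma_0 = 0$, we recover the result for a free electron, eq. (16.20), as it should be. Integrating over the angles, using $\int d\Omega\,\sin^2\theta = 8\pi/3$, we find the total scattering cross-section

$$\sigma_{\rm scatt}(\omega) = \frac{8\pi}{3}\,r_0^2\,\frac{\omega^4}{(\omega^2-\omega_0^2)^2 + \omega^2\gamma_0^2}. \tag{16.44}$$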


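A plot of the cross-section (16.44) is shown in Fig. 16.2, for $\gamma_0/\omega_0 = 0.4$. Several aspects of this result are noteworthy. At low frequencies, $\omega \ll \omega_0$, we have

$$\sigma_{\rm scatt}(\omega) \simeq \frac{8\pi}{3}\,r_0^2\left(\frac{\omega}{\omega_0}\right)^4, \qquad (\omega \ll \omega_0). \tag{16.45}$$

At low frequencies, the cross-section is therefore proportional to $\omega^4$. In contrast, when the frequency $\omega$ of the electromagnetic wave matches the natural frequency $\omega_0$ of the oscillator, we are in the resonance condition and

$$\sigma_{\rm scatt}(\omega = \omega_0) = \frac{8\pi}{3}\,r_0^2\left(\frac{\omega_0}{\gamma_0}\right)^2. \tag{16.46}$$

For $\gamma_0 \ll \omega_0$, the cross-section is therefore greatly enhanced with respect to the Thomson cross-section $\sigma_T = (8\pi/3)r_0^2$. Close to the resonance we can approximate

$$\sigma_{\rm scatt}(\omega) = \frac{8\pi}{3}\,r_0^2\,\frac{\omega^4}{(\omega-\omega_0)^2(\omega+\omega_0)^2 + \omega^2\gamma_0^2} \simeq \frac{8\pi}{3}\,r_0^2\,\frac{\omega_0^4}{(\omega-\omega_0)^2(2\omega_0)^2 + \omega_0^2\gamma_0^2} = \frac{2\pi}{3}\,r_0^2\,\frac{\omega_0^2}{(\omega-\omega_0)^2 + (\gamma_0/2)^2}. \tag{16.47}$$

From this expression we see that the resonance condition is maintained in a narrow range of frequencies $\Delta\omega \sim \gamma_0/2$ centered around $\omega_0$. The peak at $\omega = \omega_0$ is an example of resonant scattering, or of a resonance.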




Finally, at large frequencies, i.e., when $\omega \gg \omega_0$ and $\omega \gg \gamma_0$, the cross-section goes to the Thomson cross-section $\sigma_T = 8\pi r_0^2/3$, corresponding to the limit of scattering on a free particle.

As we have discussed in the free particle case, at sufficiently high frequencies, when $\hbar\omega$ becomes of order $m_e c^2$, the classical computation is no longer valid. A full relativistic quantum computation shows that, beyond that value, $\sigma(\omega)$ decreases as $\log(\omega)/\omega$.
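
The limits discussed above can be checked numerically. The sketch below evaluates the classical cross-section of eq. (16.44) in units of the Thomson cross-section, together with the Lorentzian approximation of eq. (16.47). The function names are hypothetical, and frequencies are expressed in arbitrary but consistent units.

```python
def sigma_over_thomson(omega, omega0, gamma0):
    """Classical cross-section (16.44) divided by sigma_T = (8 pi / 3) r_0^2."""
    return omega**4 / ((omega**2 - omega0**2) ** 2 + (omega * gamma0) ** 2)

def lorentzian_over_thomson(omega, omega0, gamma0):
    """Near-resonance approximation (16.47), also in units of sigma_T."""
    return 0.25 * omega0**2 / ((omega - omega0) ** 2 + (0.5 * gamma0) ** 2)

if __name__ == "__main__":
    omega0, gamma0 = 1.0, 0.4   # the values used for Fig. 16.2
    for omega in (0.01, 0.5, 1.0, 2.0, 100.0):
        print(omega, sigma_over_thomson(omega, omega0, gamma0))
```

With $\gamma_0/\omega_0 = 0.4$ the ratio at resonance is $(\omega_0/\gamma_0)^2 = 6.25$, while for $\omega \ll \omega_0$ it falls off as $(\omega/\omega_0)^4$ and for $\omega \gg \omega_0$ it tends to 1.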

A beautiful consequence of this computation is that it allows us to ob- 
tain at least a first understanding (due to Lord Rayleigh, 1871) of why 
the sky is blue, and why the Sun at sunset looks redder than at noon. 
Light in the visible spectrum has a frequency that is not high enough 
to excite electronic transitions in the molecules, so we are in the regime 
$\omega \ll \omega_0$, where $\omega_0$ is the natural resonance frequency of a molecule. At
the same time, w is much larger than the typical vibrational frequencies 
of the molecules, which means that the vibrational modes of the nuclei in 
the molecules cannot follow the fast oscillations of the electromagnetic 
field, so these modes are not excited, and can be neglected. Another 
important aspect is that air is sufficiently diluted, with respect to the 
wavelength of visible light, that the molecules scatter light independently 
of each other and the computation performed previously, where we con- 
sidered the scattering from a single charge, applies. The opposite limit, 
in which a whole ensemble of charges is set into oscillation coherently 
by the incoming electromagnetic wave, would have required a different 
computation. So, for scattering of visible light by the atmosphere we 
are in the situation where eq. (16.45) holds. Compare for instance the 
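scattering cross-section for light at $\lambda = 450$ nm, in the blue region of the visible spectrum, to that of light at $\lambda = 650$ nm (red). We have

$$\left(\frac{\omega_{\rm blue}}{\omega_{\rm red}}\right)^4 = \left(\frac{\lambda_{\rm red}}{\lambda_{\rm blue}}\right)^4 \simeq 4.35, \tag{16.48}$$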
so the cross-section for scattering of blue light is about four times larger 
than for red light. When we look at the sky, in a direction different 
from that of the Sun, our eyes receive light that was emitted by the 
Sun and then scattered toward us by some molecules in the atmosphere. 
Since blue light is preferentially scattered toward us, we see a blue color. 
Conversely, at sunset, light must travel through a longer path in the atmosphere, compared to the path when the Sun is at the zenith. Since, traveling through the atmosphere, blue photons are preferentially removed, we see a redder Sun. Sunset seen by astronauts orbiting the Earth is even redder because the path length through the atmosphere is doubled (see e.g., https://www.esa.int/ESA_Multimedia/Images/2010/02/Scattered_sunlight).
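
The number in eq. (16.48) follows directly from the $\omega^4$ law of eq. (16.45). As a small sketch (the function name is an illustrative choice), the ratio of Rayleigh cross-sections for any two wavelengths is:

```python
def rayleigh_ratio(lambda_short_nm, lambda_long_nm):
    """Ratio sigma(short) / sigma(long) in the regime omega << omega_0,
    where the cross-section scales as omega^4, i.e. as 1 / lambda^4."""
    return (lambda_long_nm / lambda_short_nm) ** 4

if __name__ == "__main__":
    # Blue (450 nm) versus red (650 nm), as in eq. (16.48).
    print(round(rayleigh_ratio(450.0, 650.0), 2))
```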


Electrodynamics in 
Gaussian units 


In this appendix we translate the main results and equations of the text into 
Gaussian units. The conceptual relation between SI and Gaussian units has 
been discussed in detail in Section 2.2, where we have seen that the crucial 
difference is that, in the SI system, the unit of current (or the unit of charge) 
is a fourth independent base unit, with respect to the units of length, time, 
and mass while, in the Gaussian system, it is a derived unit. As we dis- 
cussed in Section 2.2, there are actually two variants of the Gaussian system, 
unrationalized Gaussian units (often referred to simply as Gaussian units) and
“rationalized Gaussian” (or Heaviside—Lorentz) units. Rationalized Gaussian 
units are the most common (if not universal) choice in quantum field theory. 
In this context, one normally sets also c = 1 (and even ħ = 1, but ħ does not
appear in our classical equations). As we saw in Section 2.2, the conversion 
from SI to rationalized Gaussian units is performed with the formal replace- 
ments given in eq. (2.46). If we furthermore set c = 1, the conversion from 
SI units to rationalized Gaussian units is therefore trivially performed, simply 
by making in the SI equations the formal replacements 


$$\epsilon_0 \to 1, \qquad \mu_0 \to 1, \qquad c \to 1. \tag{A.1}$$


We will therefore not consider rationalized Gaussian units further; unrational- 
ized Gaussian units, in contrast, are often used as an alternative to the SI in a 
classical context, therefore keeping c explicit. Furthermore, there are extra 4π
factors, so the conversion is less trivial, and in this appendix we write explic- 
itly most of the main formulas of the main text in (unrationalized) Gaussian 
units. 

As we have seen, the actual relation between charges and fields in Gaussian 
and SI system is given by eqs. (2.25), (2.32), and (2.33), that we recall here 


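$$q_{\rm gau} = \frac{q_{\rm SI}}{\sqrt{4\pi\epsilon_0}}, \tag{A.2}$$

$$\mathbf{E}_{\rm gau} = \sqrt{4\pi\epsilon_0}\,\mathbf{E}_{\rm SI}, \qquad \mathbf{B}_{\rm gau} = \sqrt{\frac{4\pi}{\mu_0}}\,\mathbf{B}_{\rm SI}. \tag{A.3}$$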
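
To see how these relations work in practice, here is a minimal numerical sketch of the conversion of charges and fields from SI to Gaussian (cgs) values. Evaluating $\sqrt{4\pi\epsilon_0}$ with SI numbers gives a result in meter-kilogram-second mechanical units, so an extra power of ten is needed to express it in centimeter-gram-second units; that step, the function names and the rounded constants are assumptions of this sketch and are not spelled out in the text.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F / m
MU0 = 1.25663706212e-6    # vacuum permeability, N / A^2

# Mechanical unit changes from m-kg-s to cm-g-s:
# fields have dimension sqrt(energy / volume), and 1 J / m^3 = 10 erg / cm^3;
# charges have dimension sqrt(energy * length), and 1 J m = 1e9 erg cm.
FIELD_MKS_TO_CGS = math.sqrt(10.0)
CHARGE_MKS_TO_CGS = math.sqrt(1.0e9)

def charge_si_to_gauss(q_coulomb):
    """Charge in statcoulomb (esu), from q_gau = q_SI / sqrt(4 pi eps0)."""
    return q_coulomb / math.sqrt(4.0 * math.pi * EPS0) * CHARGE_MKS_TO_CGS

def e_field_si_to_gauss(e_volt_per_m):
    """Electric field in statvolt / cm, from E_gau = sqrt(4 pi eps0) E_SI."""
    return math.sqrt(4.0 * math.pi * EPS0) * e_volt_per_m * FIELD_MKS_TO_CGS

def b_field_si_to_gauss(b_tesla):
    """Magnetic field in gauss, from B_gau = sqrt(4 pi / mu0) B_SI."""
    return math.sqrt(4.0 * math.pi / MU0) * b_tesla * FIELD_MKS_TO_CGS

if __name__ == "__main__":
    print("1 T   =", b_field_si_to_gauss(1.0), "G")
    print("1 C   =", charge_si_to_gauss(1.0), "statC")
    print("1 V/m =", e_field_si_to_gauss(1.0), "statV/cm")
```

The familiar conversion factors (1 T = 10^4 G, 1 C ≈ 3 × 10^9 statC) come out of the same two square roots that appear in eqs. (A.2) and (A.3).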


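In the Gaussian system, the gauge potentials are introduced from

$$\mathbf{B}_{\rm gau} = \nabla\times\mathbf{A}_{\rm gau}, \tag{A.4}$$

$$\mathbf{E}_{\rm gau} = -\nabla\phi_{\rm gau} - \frac{1}{c}\frac{\partial\mathbf{A}_{\rm gau}}{\partial t}. \tag{A.5}$$

Comparing with the definition of the gauge potentials in the SI system, given in eqs. (3.80) and (3.83), and using eq. (A.3), as well as $\epsilon_0\mu_0 = 1/c^2$, we get the following relation for the gauge potentials,

$$\phi_{\rm gau} = \sqrt{4\pi\epsilon_0}\,\phi_{\rm SI}, \qquad \mathbf{A}_{\rm gau} = \sqrt{\frac{4\pi}{\mu_0}}\,\mathbf{A}_{\rm SI}. \tag{A.6}$$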


1 This appendix can also be used as a compact summary of the main equations in the book; they are written here in Gaussian units, but always with a reference to the equation number of the corresponding SI equation in the main text. This allows the reader to quickly find also the desired SI equation.

2 The inverse path, from Gaussian to SI equations, is less straightforward. Once we set $\epsilon_0$ to a dimensionless value, to restore it back requires an input from dimensional analysis. To go from Gaussian to SI it is therefore better to use the actual relations (A.2, A.3).

3 See, however, eq. (A.99) for the definition of $A^\mu$ in terms of $\phi$ and $\mathbf{A}$ in Gaussian units, and eq. (A.86) for the definition of the magnetic dipole in Gaussian units.


We have also seen in eq. (2.38) that, independently of the actual relation 
between quantities in these systems, expressed by the equations given above, 
a quicker way of passing from the equations in the SI system to those in 
Gaussian units is to perform the replacements 

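$$\epsilon_0 \to \frac{1}{4\pi}, \qquad \mu_0 \to \frac{4\pi}{c^2}, \qquad \mathbf{E} \to \mathbf{E}, \qquad \mathbf{B} \to \frac{\mathbf{B}}{c} \tag{A.7}$$

(without changing $\rho$ and $\mathbf{j}$), since this formally transforms Maxwell's equations in the SI system into Maxwell's equations in the Gaussian system.² For the gauge potentials, the corresponding formal replacements are³

$$\phi \to \phi, \qquad \mathbf{A} \to \frac{\mathbf{A}}{c}. \tag{A.8}$$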
In this way, all the equations that we have written in the book can be imme- 
diately translated to Gaussian units. In any case, we find it useful to collect 
together some of the most important results, as a quick reference. We orga- 
nize this collection of results according to the chapters of the main text where 
they have been first given in SI units. We henceforth drop the subscript “gau” 
from all quantities. It is understood that all the formulas in the rest of this 
Appendix are written in (unrationalized) Gaussian units. 


Chapter 3 


In Gaussian units, Maxwell’s equations (3.1)—(3.4) read 


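$$\nabla\cdot\mathbf{E} = 4\pi\rho, \tag{A.9}$$
$$\nabla\times\mathbf{B} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} = \frac{4\pi}{c}\,\mathbf{j}, \tag{A.10}$$
$$\nabla\cdot\mathbf{B} = 0, \tag{A.11}$$
$$\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0. \tag{A.12}$$

In integrated form,

$$\oint_{\partial V} d\mathbf{s}\cdot\mathbf{E}(t,\mathbf{x}) = 4\pi Q_V(t), \tag{A.13}$$
$$\oint_{\partial S} d\boldsymbol{\ell}\cdot\mathbf{B}(t,\mathbf{x}) = \frac{4\pi}{c}\,I_S(t) + \frac{1}{c}\frac{d\Phi_E(t)}{dt}, \tag{A.14}$$
$$\oint_{\partial V} d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}) = 0, \tag{A.15}$$
$$\oint_{\partial S} d\boldsymbol{\ell}\cdot\mathbf{E}(t,\mathbf{x}) = -\frac{1}{c}\frac{d\Phi_B(t)}{dt}, \tag{A.16}$$

compare with eqs. (3.12), (3.19), (3.16), and (3.21), where we still have

$$\Phi_E(t) = \int_S d\mathbf{s}\cdot\mathbf{E}(t,\mathbf{x}), \tag{A.17}$$
$$\Phi_B(t) = \int_S d\mathbf{s}\cdot\mathbf{B}(t,\mathbf{x}), \tag{A.18}$$

and

$$Q_V(t) = \int_V d^3x\,\rho(t,\mathbf{x}), \tag{A.19}$$
$$I_S(t) = \int_S d\mathbf{s}\cdot\mathbf{j}(t,\mathbf{x}). \tag{A.20}$$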


In a non-relativistic setting, where the notion of force makes sense, the Lorentz 
force equation (3.5) becomes 


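$$\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right), \tag{A.21}$$

or, in a fully relativistic setting,

$$\frac{d\mathbf{p}}{dt} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right), \tag{A.22}$$

compare with eq. (3.6). The continuity equation for the electric charge still keeps the form (3.22),

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \tag{A.23}$$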


while the Poynting vector and energy conservation, that in SI units are given 
by eqs. (3.34) and (3.35), become, respectively, 


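$$\mathbf{S} = \frac{c}{4\pi}\,(\mathbf{E}\times\mathbf{B}), \tag{A.24}$$

and

$$\frac{d}{dt}\int_V d^3x\,\frac{E^2+B^2}{8\pi} + \int_V d^3x\,\mathbf{E}\cdot\mathbf{j} = -\oint_{\partial V} d\mathbf{s}\cdot\mathbf{S}. \tag{A.25}$$

Therefore, the energy density of the electromagnetic field, eq. (3.43), becomes

$$u = \frac{1}{8\pi}\,(E^2 + B^2). \tag{A.26}$$

The momentum density is

$$\mathbf{g} = \frac{1}{c^2}\,\mathbf{S} = \frac{1}{4\pi c}\,(\mathbf{E}\times\mathbf{B}), \tag{A.27}$$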


compare with eqs. (3.56) and (3.57), so the electromagnetic field enclosed in 
a volume V carries a momentum 


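$$\mathbf{P}_{\rm em}(t) = \int_V d^3x\,\mathbf{g}(t,\mathbf{x}) = \frac{1}{4\pi c}\int_V d^3x\,\mathbf{E}(t,\mathbf{x})\times\mathbf{B}(t,\mathbf{x}), \tag{A.28}$$

compare with eq. (3.67). The Maxwell stress tensor (3.64) becomes

$$T_{ij} = \frac{1}{4\pi}\left[\frac{1}{2}(E^2 + B^2)\,\delta_{ij} - E_iE_j - B_iB_j\right], \tag{A.29}$$

while the time derivative of the mechanical momentum associated with a charge and current density in an electromagnetic field, eq. (3.68), becomes

$$\frac{d\mathbf{P}_{\rm mech}}{dt} = \int_V d^3x\left(\rho\mathbf{E} + \frac{1}{c}\,\mathbf{j}\times\mathbf{B}\right). \tag{A.30}$$

The angular momentum of the electromagnetic field is

$$\mathbf{J}_{\rm em} = \frac{1}{4\pi c}\int d^3x\,\mathbf{x}\times(\mathbf{E}\times\mathbf{B}), \tag{A.31}$$

to be compared with eq. (3.76), while eq. (3.79) becomes

$$\frac{d\mathbf{J}_{\rm mech}}{dt} = \int_V d^3x\,\mathbf{x}\times\left(\rho\mathbf{E} + \frac{1}{c}\,\mathbf{j}\times\mathbf{B}\right). \tag{A.32}$$

The gauge potentials are introduced from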


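$$\mathbf{B} = \nabla\times\mathbf{A}, \tag{A.33}$$
$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \tag{A.34}$$

compare with eqs. (3.80) and (3.83). In terms of them, the two Maxwell's equations that do not depend on the sources are automatically satisfied, while those that depend on the sources, that in the SI system are given by eqs. (3.84) and (3.85), become

$$\nabla^2\phi + \frac{1}{c}\frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = -4\pi\rho, \tag{A.35}$$
$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \nabla\left(\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) = -\frac{4\pi}{c}\,\mathbf{j}. \tag{A.36}$$

The gauge transformation (3.86) reads

$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} - \nabla\theta, \tag{A.37}$$
$$\phi \to \phi' = \phi + \frac{1}{c}\frac{\partial\theta}{\partial t}. \tag{A.38}$$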
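The Lorenz gauge (3.89) is now defined from

$$\frac{1}{c}\frac{\partial\phi}{\partial t} + \nabla\cdot\mathbf{A} = 0, \tag{A.39}$$

and, in this gauge, eqs. (A.35) and (A.36) become

$$\Box\phi = -4\pi\rho, \tag{A.40}$$
$$\Box\mathbf{A} = -\frac{4\pi}{c}\,\mathbf{j}, \tag{A.41}$$

compare with eqs. (3.90) and (3.91). The Coulomb gauge still reads $\nabla\cdot\mathbf{A} = 0$ and, in this gauge, eqs. (A.35) and (A.36) become

$$\nabla^2\phi = -4\pi\rho, \tag{A.42}$$
$$\Box\mathbf{A} = -\frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\,\nabla\frac{\partial\phi}{\partial t}, \tag{A.43}$$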

compare with eqs. (3.93) and (3.94). 


Chapter 4 
Poisson’s equation (4.3) becomes 
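$$\nabla^2\phi = -4\pi\rho. \tag{A.44}$$

The solution can be written in terms of the Green's functions of the Laplacian as

$$\phi(\mathbf{x}) = -4\pi\int d^3x'\,G(\mathbf{x}-\mathbf{x}')\,\rho(\mathbf{x}'), \tag{A.45}$$

and, given that the Green's function of the Laplacian is still given by eq. (4.15), we get

$$\phi(\mathbf{x}) = \int d^3x'\,\frac{\rho(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}, \tag{A.46}$$

to be compared with eq. (4.16). The corresponding electric field is

$$\mathbf{E}(\mathbf{x}) = \int d^3x'\,\rho(\mathbf{x}')\,\frac{\mathbf{x}-\mathbf{x}'}{|\mathbf{x}-\mathbf{x}'|^3}, \tag{A.47}$$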


to be compared with eq. (4.20) and, for a point charge, the Coulomb force 


reads dig 
192 » 
[as we already wrote in eq. (2.23)], to be compared with eq. (2.6). For the 


electric field on the surface of a conductor, eq. (4.54) becomes 


n-E = 470, (A.49) 
while eq. (4.60) becomes 
dpa 1 2 A 1 2a 
= — d E-n)E— -E i A. 
A TR s k h) z ô (A.50) 


The basic equations of magnetostatics are 


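$$\nabla\times\mathbf{B} = \frac{4\pi}{c}\,\mathbf{j}, \tag{A.51}$$
$$\nabla\cdot\mathbf{B} = 0, \tag{A.52}$$

compare with eqs. (4.67) and (4.68), and the integrated forms are

$$\oint_{\mathcal{C}} d\boldsymbol{\ell}\cdot\mathbf{B} = \frac{4\pi}{c}\,I, \tag{A.53}$$
$$\oint_{\partial V} d\mathbf{s}\cdot\mathbf{B} = 0, \tag{A.54}$$

compare with eqs. (4.70) and (4.71). The magnetic field generated by an infinite straight wire, eq. (4.79), now reads

$$\mathbf{B}(\rho,\varphi,z) = \frac{2I}{c\rho}\,\hat{\boldsymbol{\varphi}}. \tag{A.55}$$

Equation (4.91) becomes

$$\nabla^2\mathbf{A} = -\frac{4\pi}{c}\,\mathbf{j}, \tag{A.56}$$

whose solution is

$$\mathbf{A}(\mathbf{x}) = \frac{1}{c}\int d^3x'\,\frac{\mathbf{j}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|}, \tag{A.57}$$

compare with eq. (4.92). The corresponding solution for $\mathbf{B}$ is

$$\mathbf{B}(\mathbf{x}) = \frac{1}{c}\int d^3x'\,\frac{\mathbf{j}(\mathbf{x}')\times(\mathbf{x}-\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|^3}, \tag{A.58}$$

compare with eq. (4.95). For a thin wire these become, respectively,

$$\mathbf{A}(\mathbf{x}) = \frac{I}{c}\oint_{\mathcal{C}} d\boldsymbol{\ell}\,\frac{1}{|\mathbf{x}-\mathbf{x}(\ell)|}, \tag{A.59}$$

and

$$\mathbf{B}(\mathbf{x}) = \frac{I}{c}\oint_{\mathcal{C}} d\boldsymbol{\ell}\times\frac{\mathbf{x}-\mathbf{x}(\ell)}{|\mathbf{x}-\mathbf{x}(\ell)|^3}, \tag{A.60}$$

compare with eqs. (4.104) and (4.105). Equations (4.109) and (4.110) become, respectively,

$$\mathbf{F} = \frac{I}{c}\oint_{\mathcal{C}} d\boldsymbol{\ell}\times\mathbf{B}, \tag{A.61}$$

and

$$\mathbf{F} = \frac{1}{c}\int d^3x\,\mathbf{j}(\mathbf{x})\times\mathbf{B}(\mathbf{x}). \tag{A.62}$$

The force between two parallel wires, eq. (4.112), becomes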


dF, 2hh , 
_ A. 
7 “aq Po (A.63) 


while eqs. (4.118) and (4.119) become, respectively, 


1 R . x—xX 
Fi — -5 | Pedr iit) jx xf > (A.64) 
and 
IIo f x(£) — x(L’) 
Fi =- dé, -dé. —~—_-_~_ A.65 
E Je, Jo, OO RO AUNT An 
Equation (4.134) becomes 
1 dg 
ode ~~ fy d (E+ EB) (A.66) 


Chapter 5 


The electrostatic energy of a static system of point charges, eq. (5.7), becomes 


(Ez)p TI za 7: (A.67) 


a=] ba 


The electrostatic potential felt by the a-th point charge because of the inter- 
action with all other charges is 


ga A. 
Pa (X1; -5 mm (A.68) 
compare with eq. (5.8), so eq. (A.67) can also be written as 


d,s (A.69) 


so eq. (5.9) is unchanged. Eqs. (5.43)—(5.45), valid for a set of conductors, 
also stay unchanged. For a continuous charge distribution 


in 


ee. = : f dz p(x)e(x), (A.70) 
_ 3 nd? 1 P(x) e(x J 
= sfa Pa ERE, (A.71) 


compare with eqs. (5.15) and (5.16). The corresponding expressions for the 
magnetic energy are 


Ep = z | Pxie0-a (x) (A.72) 


— 5 aa fs Ër ,j(x J: i) (A.73) 


[x — x| 


compare with eqs. (5.52) and (5.53). For a set of loops, eq. (5.61) becomes 


1 N 
= 5 D IaB a. (A.74) 


The mutual inductances and self-inductances are now defined from 


N 
1 

Spa = Leelys A. 
a8: X oly (A.75) 


b=1 


to be compared with eq. (5.63), so that we still have 


N 
1 
=5 XO Lavlalo, (A.76) 


as in eq. (5.67). The flux through loop a of the magnetic field Bz generated 
by loop 6 is still given by eq. (5.64), 


(®e)ab = [we 


a 


= dla: As[x(la)] - (A.77) 
Ca 


Inserting here the expression for A, obtained from eq. (A.59), 


(A.78) 


-2g ba x dey (A.79) 
Cy ZX [x(la) — x(%)] 
and therefore eq. (A.75) gives 


= -dl 
bat = bm Ea (A.80) 


to be compared with eq. (5.66). 


we get 


Chapter 6 


The expansion in electric multipoles is obtained ahs 


6) =1 ferh (F) as 


compare with eq. (6.5). The electric dipole moment is still defined as in 
eq. (6.8). Then, the electric dipole term of the potential is 


d-n 
Pdipole(X) = "pe? (A.82) 
and the corresponding electric field is [compare with eq. (6.13)] 
3(d-n)n—d 
Eaipole = ee ` (A.83) 


The expansion of the scalar potential up to the electric quadruple term is 


o(x) = 24 nis Ruse (A.84) 


r r2 
where the quadrupole Qj; is still defined by eq. (6.18). The expansion in 
magnetic multipoles is obtained expanding eq. (A.57) as 


A(x) = — [ae 'j(x h+? A *+0(4)| (A.85) 
4 The extra factor of c in eq. (A.86) is indeed inserted so that there is no explicit factor of c in eq. (A.96).

The magnetic dipole moment is now defined as

$$\mathbf{m} = \frac{1}{2c}\int d^3x\,\mathbf{x}\times\mathbf{j}(\mathbf{x}), \tag{A.86}$$

with an extra factor 1/c compared to eq. (6.36). Then, for a closed loop of current,

$$\mathbf{m} = \frac{I}{2c}\oint_{\mathcal{C}} \mathbf{x}\times d\boldsymbol{\ell}, \tag{A.87}$$

and for a closed planar loop,

$$\mathbf{m} = \frac{IA}{c}\,\hat{\mathbf{n}}, \tag{A.88}$$

while, for a charged particle,

$$\mathbf{m} = \frac{q_a}{2m_a c}\,\mathbf{L}, \tag{A.89}$$


compare with eqs. (6.41)—(6.43). With this definition of magnetic moment, 
the vector potential and magnetic field generated by a magnetic dipole are, 


respectively, 


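$$\mathbf{A}_{\rm dipole}(\mathbf{x}) = \frac{\mathbf{m}\times\mathbf{x}}{r^3}, \tag{A.90}$$

and

$$\mathbf{B}_{\rm dipole} = \frac{3(\mathbf{m}\cdot\hat{\mathbf{n}})\hat{\mathbf{n}} - \mathbf{m}}{r^3}, \tag{A.91}$$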


to be compared with eqs. (6.38) and (6.40). For a point-like electric dipole, 
eq. (6.53) becomes 


a d(x), (A.92) 


while, for a point-like magnetic dipole, eq. (6.58) becomes 


3(m-ñ)û — m sr m 5@)(x). (A.93) 


B= 73 


Observe that, thanks to the extra factor 1/c in eq. (A.86), there is no explicit 
factor of c in eq. (A.93). 
The mechanical potential for the electric dipole still keeps the form (6.61), 


(UB) aipole(X) = —d-Eext (x) ; (A.94) 
while the electric dipole-dipole interaction (6.92) becomes 


dı-d2 — 3(di-#)(do-#) 
r3 


4 
(UB )dipole—dipole = H 5 di-d26 (r). (A.95) 


Similarly, we still have* 
(Uz )aipote(X) = =m: Bext (x) , (A.96) 
as in eq. (6.98), while eq. (6.105) becomes 


mı:mə — 3(m;-A)(m2-A) 8r 
r3 3 


(UB) aipole—dipole = mi-m2 5°) (r) . (A.97) 


Chapter 8 


In Gaussian units, the four-current j” is still written 
j" = (cp, j) » (A.98) 


as in eq. (8.9). Replacing formally A —> A/c, see eq. (A.8), eq. (8.12) would 
become A“ = (¢/c, A/c). However, it is convenient to get rid of the overall 
1/c factor, and define A” as 


A" = (¢,A). (A.99) 


F” is still given by F#” = 0" A” — ð” A", as in eq. (8.15), where, however, now 
A" is related to the scalar and vector potential as in eq. (A.99). Therefore, 
instead of eq. (8.16), we have 
p = gA aia? 
1 
= OA — 0d, (A.100) 


and therefore, comparing with the expression of E in terms of A and ¢ in the 
Gaussian system, given in eq. (A.34), we get 


F% =F’, (A.101) 


without the factor 1/c that appears in eq. (8.17). For the (ij) components, 
we still get Fi; = €ijx Br, as in eq. (8.20). Then eq. (8.21) becomes 


0 Eı Ez Es 
—Eı 0 B3 — Bə 
-E2 -B3 0 Bı 
— E3 Bo -Bı 0 


FY = (A.102) 


We see that the Gaussian system treats the electric and magnetic fields on 
the same footing as components of F”“”, without an extra factor of 1/c in 
the electric field, which is more natural from the point of view of Lorentz 
covariance. 

In terms of this F””, the first pair of Maxwell’s equation, that in Gaussian 
units have the form (A.9, A.10), becomes 


$$\partial_\mu F^{\mu\nu} = -\frac{4\pi}{c}\,j^\nu. \tag{A.103}$$


Comparing with eq. (8.23) we see that, in the covariant formalism, the passage from SI to Gaussian equations can be formally performed using, together with the replacements $\epsilon_0 \to 1/(4\pi)$, $\mu_0 \to 4\pi/c^2$, already given in eq. (A.7), also the replacement

$$A^\mu \to \frac{1}{c}\,A^\mu \tag{A.104}$$

(and therefore also $F^{\mu\nu} \to F^{\mu\nu}/c$). This is a consequence of having rescaled $A^\mu$ by an overall factor of c compared to the SI definition, as discussed above eq. (A.99). Then, eq. (8.28) becomes

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = -\frac{4\pi}{c}\,j^\nu, \tag{A.105}$$

while the Lorenz gauge condition remains $\partial_\mu A^\mu = 0$, so, in the Lorenz gauge, eq. (A.105) becomes

$$\Box A^\mu = -\frac{4\pi}{c}\,j^\mu. \tag{A.106}$$
The definition (8.31) of $\tilde{F}^{\mu\nu}$, as well as the second pair of Maxwell's equations (8.33), are unchanged.

The energy-momentum tensor of the electromagnetic field, eq. (8.34), be- 
comes 


1 1 
T” = —— | Pek” +a FPR, |. Al 
p= ( p T 4” p ) (A.107) 
and, in terms of the electric and magnetic fields 
1 
T” = —(E?+B’)= A.l 
gl )=u, (A.108 
Oi Lise pi pk — lei 
T = —e EB =—S', (A.109 
4T Č 


[compare with eqs. (8.36) and (8.37)], where the energy density u and the 
Poynting vector S in the Gaussian system were already given in eqs. (A.26 
and (A.24). Equation (8.39) becomes 


ƏT” = -ipj (A.110 


Its u = 0 component still gives 


ðu +V-S =—-Ej, (A.111 


as in eq. (8.41); however, now u, S and E are the quantities in the Gaussian 
system. The u = i component, instead, becomes 


Ogi l. 
at + O;Ti3 = (2 t xB) z (A.112) 
where g? = T°'/c. Note the extra factor 1/c in front of the current in 


eq. (A.112), compared to eq. (8.44), completely analogous to the extra 1/c 
factor in eq. (A.21) compared to eq. (3.5). 
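The transformations of $\mathbf{E}$ and $\mathbf{B}$ under boosts, eqs. (8.57) and (8.58), become

$$\mathbf{E}'_\parallel = \mathbf{E}_\parallel, \qquad \mathbf{E}'_\perp = \gamma\left(\mathbf{E}_\perp + \frac{\mathbf{v}}{c}\times\mathbf{B}_\perp\right), \tag{A.113}$$

and

$$\mathbf{B}'_\parallel = \mathbf{B}_\parallel, \qquad \mathbf{B}'_\perp = \gamma\left(\mathbf{B}_\perp - \frac{\mathbf{v}}{c}\times\mathbf{E}_\perp\right). \tag{A.114}$$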
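
A convenient check of eqs. (A.113) and (A.114) is that, in Gaussian units, they leave the combinations $\mathbf{E}\cdot\mathbf{B}$ and $E^2 - B^2$ unchanged, with no factors of c needed. The sketch below implements the boost for a velocity $\boldsymbol{\beta} = \mathbf{v}/c$; the helper names are illustrative choices, and the field values are arbitrary test numbers.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boost_fields(E, B, beta):
    """Apply eqs. (A.113)-(A.114) for a boost with velocity beta = v / c."""
    b2 = dot(beta, beta)
    if b2 == 0.0:
        return tuple(E), tuple(B)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    n = tuple(x / math.sqrt(b2) for x in beta)      # unit vector along v
    E_par = tuple(dot(E, n) * ni for ni in n)       # components parallel to v
    B_par = tuple(dot(B, n) * ni for ni in n)
    E_perp = tuple(e - p for e, p in zip(E, E_par)) # components orthogonal to v
    B_perp = tuple(b - p for b, p in zip(B, B_par))
    bxB = cross(beta, B_perp)
    bxE = cross(beta, E_perp)
    E_new = tuple(p + gamma * (t + c) for p, t, c in zip(E_par, E_perp, bxB))
    B_new = tuple(p + gamma * (t - c) for p, t, c in zip(B_par, B_perp, bxE))
    return E_new, B_new

if __name__ == "__main__":
    E, B, beta = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0), (0.3, -0.2, 0.5)
    E2, B2 = boost_fields(E, B, beta)
    print(dot(E, B), dot(E2, B2))
    print(dot(E, E) - dot(B, B), dot(E2, E2) - dot(B2, B2))
```

Boosting back with $-\boldsymbol{\beta}$ returns the original fields, which is a second simple consistency test.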
The covariantization of the Lorentz force equation in Gaussian units, eq. (A.22), 
gives poh 
p q ppv 
Tr = P A.11 
dr c Mu ( 5) 


to be compared with eq. (8.62). The interaction action (8.68) becomes 
Sine = 4 / dr u(r) A" aM. (A.116) 


and eq. (8.71) becomes 


L[x(t), v(t)] = —me?4/1— a — got, x(t)] + 4v-Alt,x(t)]. (A.117) 


Equations (8.73) and (8.74) become 


Sm = 3 I de j"(2)Ay(2), (A.118) 


[ads | tt, 3) (4%) + A(t, x)-A(t,x) ,  (A.119) 


and the conjugate momentum P (8.78) becomes 


P=p+ tA, (A.120) 
while the Hamiltonian (8.87) becomes 
H(P,x) = cy (P — qA/c)? + me +g. (A.121) 


In Gaussian units, the Lagrangian density of the free electromagnetic field, 
eq. (8.115) [see also eq. (8.127)], becomes 


1 v 
Lo = -grt E" (A.122) 
1 
= zE - B’), (A.123) 


while, as we already saw in eq. (A.118), the interaction Lagrangian density 


(8.116) becomes? 
1 


Lint = a Anj" : (A.126) 


The frequency at which a charged particle rotates in a magnetic field, eqs. (8.201) and (8.202), is given by

$$\omega = \frac{qB}{\gamma mc} = \frac{qBc}{\mathcal{E}}, \tag{A.127}$$

so the cyclotron frequency (8.203) becomes

$$\omega_c = \frac{qB}{mc}. \tag{A.128}$$
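
As a sanity check that the Gaussian expression $qB/(mc)$ and the SI expression $qB/m$ describe the same physical frequency, the following sketch evaluates both for an electron in a field of 1 T = 10^4 G. The constants are rounded CODATA values and the function names are illustrative choices.

```python
import math

# Electron charge and mass in SI and in Gaussian (cgs) units.
E_SI = 1.602176634e-19      # C
M_E_SI = 9.1093837015e-31   # kg
E_GAU = 4.80320471e-10      # statC
M_E_CGS = 9.1093837015e-28  # g
C_CGS = 2.99792458e10       # cm / s

def cyclotron_si(q, b_tesla, m):
    """SI form of the cyclotron frequency, omega_c = q B / m, in rad / s."""
    return q * b_tesla / m

def cyclotron_gauss(q, b_gauss, m, c=C_CGS):
    """Gaussian form of eq. (A.128), omega_c = q B / (m c), in rad / s."""
    return q * b_gauss / (m * c)

if __name__ == "__main__":
    print(cyclotron_si(E_SI, 1.0, M_E_SI))
    print(cyclotron_gauss(E_GAU, 1.0e4, M_E_CGS))
```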


Chapter 9 


Equation (9.10) remains OA” = 0 also in Gaussian units, and therefore the 
discussion of the solutions goes through without changes. Similarly, also the 
radiation gauge is still defined by the conditions A° = 0, V-A = 0, so all the 
equations of Section 9.3 are unchanged. 

In terms of the electric and magnetic field, the wave solution (9.47, 9.48) 
becomes 


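$$\mathbf{E}(t,\mathbf{x}) = \hat{\boldsymbol{\epsilon}}(\mathbf{k})\,E_{\mathbf{k}}\,e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}, \tag{A.129}$$
$$\mathbf{B}(t,\mathbf{x}) = [\hat{\mathbf{k}}\times\hat{\boldsymbol{\epsilon}}(\mathbf{k})]\,E_{\mathbf{k}}\,e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}, \tag{A.130}$$

and eq. (9.51) becomes

$$\mathbf{B}(t,\mathbf{x}) = \hat{\mathbf{k}}\times\mathbf{E}(t,\mathbf{x}), \tag{A.131}$$

so, in particular, now $|\mathbf{B}| = |\mathbf{E}|$. Using eqs. (A.26) and (A.27) we find that the energy density of an electromagnetic wave is

$$u(t,\mathbf{x}) = \frac{|\mathbf{E}|^2}{4\pi}, \tag{A.132}$$

and the momentum density is

$$\mathbf{g} = \frac{|\mathbf{E}|^2}{4\pi c}\,\hat{\mathbf{k}}, \tag{A.133}$$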


to be compared with eqs. (9.57) and (9.58). 
5 Observe that we have defined the Lagrangian density $\mathcal{L}$ so that it gives the action upon integration over $d^4x$, see eq. (8.101). If, instead, one defines it so that

$$S = \int dt\,d^3x\,\mathcal{L},$$

then eqs. (A.122) and (A.126) become

$$\mathcal{L}_0 = -\frac{1}{16\pi}\,F_{\mu\nu}F^{\mu\nu}, \tag{A.124}$$

and

$$\mathcal{L}_{\rm int} = \frac{1}{c}\,A_\mu j^\mu. \tag{A.125}$$

Also observe [as already discussed before eq. (8.117)] that, if one uses the opposite metric signature to ours, $\mathcal{L}_0$ is still given by eq. (A.124) while eq. (A.125) becomes $\mathcal{L}_{\rm int} = -(1/c)A_\mu j^\mu$. In this way, we get the Lagrangian density given in eq. (12.85) of Jackson (1998).

Chapter 10


The Green’s functions are still defined by eq. (10.1), so the solution of eq. (A.106) 
is 


A“ (2) = -= / dn! O(n, xj (a), (A.134) 


which replaces eq. (10.10). The retarded and advanced Green’s functions are 
still given by eq. (10.24). The retarded solution then reads 


H H 1 3 1 JP 
A" (t,x) = Af (t,x) + 7 dx ; (A.135) 
to be compared with eq. (10.34). Using eqs. (A.98) and (A.99), we see that in 
the static limit eq. (A.135) reduces correctly to eqs. (A.46) and (A.57). The 
Liénard—Wiechert potentials (10.54) and (10.55) become 


= q 
o(t,x) = (x -A ; (A.136) 
and 
A(t, x)= (ia) . (A.137) 
R Te’ ret 
For a charge in uniform motion, eqs. (10.88) and (10.89) become 
1— v?/e? R(t, x) 
E(t = - ; A.138 
(ex Tt (w/e) sin? 8/2 R2(t,x) ’ ie) 
Bie = iy x E(t, x). (A.139) 


The electric field of accelerated charges can be written as in eq. (10.103), where 
now E, and Ersaq are given by 


E(t,x) = £ Men (A.140) 
Re 42(1 — Ra-v;/c)3 
Ralts = q [vr x (Ra — vr/c)] x Ra (A.141) 
ane Ra c2(1— Ra-vy/c)3 l 


instead of eqs. (10.104) and (10.105), while eq. (10.108) becomes 
B(t,x) = Ra x E(t, x). (A.142) 


Then, writing B = B, + Braa as in eq. (10.109), 


migo t 7 (A.143) 
c RÈ y2(1 — Ra-v,/c)3 
and 
B.aa(t,x) = 1 q (vrxRa)(vr Ra) + c(1 — Ra vr/c)vrxRa l (A.144) 


c Ra c2(1 — Ra-v,/e)3 


to be compared with eqs. (10.112) and (10.113). To lowest order in v/c and 
in 1/r, the electric field becomes 
E(t,r) ~ -—+-di(t—r/c) (A.145) 


fi x [Â x d(tret)] ; (A.146) 


to be compared with eqs. (10.132) and (10.135), while eq. (10.139) becomes 


B(t,x) = =- = Ô X (trot) . (A.147) 


The power radiated per unit solid angle is

$$\frac{dP(t;\theta)}{d\Omega} = \frac{1}{4\pi c^3}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2\sin^2\theta, \tag{A.148}$$

and its angular integral gives

$$P(t) = \frac{2}{3c^3}\,|\ddot{\mathbf{d}}(t_{\rm ret})|^2, \tag{A.149}$$

which is Larmor's formula in Gaussian units, compare with eqs. (10.144) and (10.148). The relativistic Larmor's formula (10.156) or (10.161) becomes


dE 20° 6 | 2 veal) 
= = a“ — A.150 
dtret 3c3 e ( ) 

20° 
= sat (ai + VO) pices (A.151) 
and its covariantization (10.169) becomes 
dPha _ 24? (dp dp” 

dr 3m? ( dr dr p". (A-152) 


The power radiated when the acceleration and the velocity are parallel be- 


comes 23 4 
dP. paratiel (tret ) _ qa sin^ 0 


Al 
dQ 4r (1— Bcos@)> ’ Geis) 
compare with eq. (10.173), while, when they are orthogonal, is 
` 22 - 2 2 
dPe,circ(tret) q'a 1 sin“ 0 cos” @ l (A.154) 
dQ 4rc? (1 — Bcosé@) y? (1 — 8 cos 6)? 


compare with eq. (10.191). Finally, in Gaussian units, the fine structure constant is written as

$$\alpha = \frac{e^2}{\hbar c}, \tag{A.155}$$

to be compared with eq. (10.226). In the rationalized Gaussian units more commonly used in quantum field theory, $\alpha = e^2/(4\pi\hbar c)$, or simply $\alpha = e^2/(4\pi)$ if one also uses units $\hbar = c = 1$.
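
Since $\alpha$ is dimensionless, its numerical value must come out the same in every system of units. The short sketch below evaluates it in unrationalized Gaussian units, in rationalized (Heaviside-Lorentz) units where the charge is larger by a factor $\sqrt{4\pi}$, and in SI. The constants are rounded CODATA values and the variable names are illustrative.

```python
import math

# Gaussian (cgs) values.
E_GAU = 4.80320471e-10     # electron charge, statC
HBAR_CGS = 1.054571817e-27 # erg s
C_CGS = 2.99792458e10      # cm / s

# SI values.
E_SI = 1.602176634e-19     # C
HBAR_SI = 1.054571817e-34  # J s
C_SI = 2.99792458e8        # m / s
EPS0 = 8.8541878128e-12    # F / m

# Unrationalized Gaussian units, eq. (A.155).
alpha_gauss = E_GAU**2 / (HBAR_CGS * C_CGS)

# Rationalized Gaussian (Heaviside-Lorentz) units: e_HL = sqrt(4 pi) e_gau.
e_hl = math.sqrt(4.0 * math.pi) * E_GAU
alpha_hl = e_hl**2 / (4.0 * math.pi * HBAR_CGS * C_CGS)

# SI units.
alpha_si = E_SI**2 / (4.0 * math.pi * EPS0 * HBAR_SI * C_SI)

if __name__ == "__main__":
    print(1.0 / alpha_gauss, 1.0 / alpha_hl, 1.0 / alpha_si)
```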


Chapter 11 


Equation (11.24), which gives the electric field at order 1/r in a large distance 
expansion, but for arbitrary velocities of the source, becomes 


E(t, x) = 4 | Pax ji(t—r/e+n-x'/c,x’), (A.156) 
while eq. (11.27) becomes, as usual, B = ñ x E. Equation (11.39) becomes 


he Tje x). (A.157) 


The radiated power, again exact in the velocities of the source, is 


2 


d ad , (A.158) 


dtdQ — 4r 


Oi f ajalta’) 
6 Recall that, in the translation from SI to Gaussian units of equations involving the magnetic dipole, one must also take into account the extra factor of c between the definitions (6.36) and (A.86).


to be compared with eq. (11.51), and the frequency spectrum, eqs. (11.60), 
(11.65), and (11.72), becomes 


dE 1 x . 
Wide T prg “io wio)? (A.159) 
= zzz Âx] (w, wâ/c) (A.160) 

1 7 " 7 A 
= zzz” (li,wa/o)? - èw wâ). (A161) 


In the low-velocity limit, the multipole expansion of the gauge potentials, 
eqs. (11.126) and (11.127), becomes® 


NN; 


hud; t 
"util 6c? 


o(t,x) = aC . (A.162) 


1 
TE 


Gut r/o)) 


ac r/c) 4 Ğut r/c)ù;j 4 catiy(t—r/e)as| 


The corresponding electric field, eq. (11.130), becomes 


B(x) = 3 [ax (ax d) + Zax (ax G) +A x al , (A163) 
cr 6c iE 
and the radiated power, eq. (11.133), becomes 
De a Dg 
P(t)= d|" 4 Qi; tas A.164 
() Fa | 180c® QijQiy + za ™l loo. oe 


Note that the definition (A.86) of the magnetic dipole in the Gaussian system 
“hides” a factor of 1/c in m, so it is less explicit, but of course still true, 
that the electric quadrupole and magnetic dipole contributions to the power 
are both proportional to 1/c®, compared to the leading electric dipole term, 
which is proportional to 1/ 2. 


Chapter 12 


To 1PN order, the expression for the gauge potentials in the Coulomb gauge, 
given in eqs. (12.31) and (12.37), becomes 


db 
Pasa (t, x) = 5 ; (A.165) 
2- =O] 
_ it vo(t) 
Aajext(t; x) a 2 % -xE (A.166) 
al x — x(t 
ED Ee Lx — x(t] ld) 
b4a b 
so eqs. (12.47) and (12.48) read 
ga(X1,-..,XN) = A (A.167) 
b4a Tab 
1 Kk 
Aa(X1,---;XNjV1,---VN) F 5 > [vo + ab (ab: ve )] . (A-168) 
ab 
bAa 
The Darwin Lagrangian (12.54, 12.55) reads $L = L_N + L_{\rm 1PN}$ where


Ni 2 1 N N PPA 
In = D gmota pe 
a=1 


(A.169) 
Tab 
a=1 bža 
N fos gil 1X 
L = aUa | 
LEN 8c2 4e2 eee r 
a=1 a=1 ba 


oF Tyaeve + (Pab Va) (ĉa vo)]. (A.170) 
ab 
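To 1PN order, the conjugate momentum (12.66) is given by 

\[ \mathbf{P}_a = \left(1+\frac{v_a^2}{2c^2}\right) m_a\mathbf{v}_a + \frac{1}{2c^2}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}}\left[ \mathbf{v}_b + \hat{\mathbf{r}}_{ab}\,(\hat{\mathbf{r}}_{ab}\cdot\mathbf{v}_b) \right] , \tag{A.171} \]

which can be written as 

\[ \mathbf{P}_a = \left(1+\frac{v_a^2}{2c^2}\right) m_a\mathbf{v}_a + \frac{q_a}{c}\,\mathbf{A}_{a,{\rm ext}}[t,\mathbf{x}_a(t)] , \tag{A.172} \]

with $\mathbf{A}_{a,{\rm ext}}[t,\mathbf{x}_a(t)]$ given by eq. (A.166). The 1PN Hamiltonian (12.71, 12.76) becomes 

\[ H = \sum_{a=1}^{N}\left( \frac{P_a^2}{2m_a} - \frac{P_a^4}{8m_a^3c^2} \right) + \frac{1}{2}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}} - \frac{1}{4c^2}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{m_a m_b r_{ab}}\left[ \mathbf{P}_a\cdot\mathbf{P}_b + (\hat{\mathbf{r}}_{ab}\cdot\mathbf{P}_a)(\hat{\mathbf{r}}_{ab}\cdot\mathbf{P}_b) \right] \tag{A.173} \]

\[ \phantom{H} = \sum_{a=1}^{N}\left[ \frac{P_a^2}{2m_a} - \frac{P_a^4}{8m_a^3c^2} + \frac{1}{2}\,q_a\phi_{a,{\rm ext}} - \frac{q_a}{2m_a c}\,\mathbf{P}_a\cdot\mathbf{A}_{a,{\rm ext}} \right] . \tag{A.174} \]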
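The energy balance equation (12.140) now reads 

\[ \frac{d}{dt}\left( E_N + E_{\rm 1PN} + E_{\rm 1.5PN} \right) = -\frac{2}{3c^3}\,|\ddot{\mathbf{d}}|^2 , \tag{A.175} \]

where 

\[ E_N = \sum_{a=1}^{N}\frac{1}{2}m_a v_a^2 + \frac{1}{2}\sum_{a=1}^{N}\sum_{b\neq a}\frac{q_a q_b}{r_{ab}} , \tag{A.176} \]

\[ E_{\rm 1.5PN} = -\frac{2}{3c^3}\,\dot{\mathbf{d}}\cdot\ddot{\mathbf{d}} , \tag{A.177} \]

and $E_{\rm 1PN}$ is obtained using eq. (A.171) and then collecting the terms $1/c^2$ in eq. (A.173). The Abraham-Lorentz-Dirac equation (12.150) becomes 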


2 v 
2q uku q 
T a uv a Wa + a ppv 
Maŭa = a5 |" ta | tay + F; 


ext [La (T)]ua,v 3 (A.178) 
while the Abraham-Lorentz equation (12.151) reads 

\[ m_a\frac{d\mathbf{v}_a}{dt} = \frac{2q_a^2}{3c^3}\,\frac{d^2\mathbf{v}_a}{dt^2} + \mathbf{F}_{\rm ext} . \tag{A.179} \]
The radiation-reaction four-force is still written as Fi’), = Phat Fenote, Where 
now 
2q dpa,v dpa 
FE = S : k k A.180 
rad 3m3 c5 ( dr dr ai ( ) 
2qa dph 
F Schott = 


ae (A.181) 
The Landau-Lifshitz form of the ALD equation is 


mù” = T Fikia + Ta | (OFK) wate (A.182) 
Ç C 


ext ext 

qa qa 

a HP pext, o a ext o pY H 
Tma TE pa Ua =F Mac! (Foo ta) (Fea ua,v) uk] i 


where $\tau_a = 2q_a^2/(3m_a c^3)$. 


Chapter 13 


In Gaussian units eqs. (13.21) and (13.22) remain unchanged because in this case the factor $1/(4\pi\epsilon_0)$ appears neither in the Gaussian versions of eqs. (13.18) and (13.20) nor in the Gaussian version of eq. (11.151), so we still have 

\[ \rho_{\rm pol}(t,\mathbf{x}) = -\nabla\cdot\mathbf{P}(t,\mathbf{x}) , \qquad \sigma_{\rm pol} = \hat{\mathbf{n}}_s\cdot\mathbf{P}(t,\mathbf{x}) . \tag{A.183} \]


Similarly, eq. (13.27) still holds while, reproducing the steps in eqs. (13.29)–(13.32) and taking into account the definition (A.86) of the magnetic dipole (which, as we already remarked, has a different factor of $c$ with respect to the SI system), eqs. (13.32) and (13.33) become 

\[ \mathbf{j}_{\rm mag}(t,\mathbf{x}) = c\,\nabla\times\mathbf{M}(t,\mathbf{x}) , \qquad \mathbf{K}_{\rm mag}(t,\mathbf{x}) = c\,\mathbf{M}(t,\mathbf{x})\times\hat{\mathbf{n}}_s . \tag{A.184} \]

The displacement vector is now defined as 

\[ \mathbf{D} = \mathbf{E} + 4\pi\mathbf{P} , \tag{A.185} \]


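to be compared with eq. (13.40), while the H field is defined as 

\[ \mathbf{H} = \mathbf{B} - 4\pi\mathbf{M} , \tag{A.186} \]

to be compared with eq. (13.43). In Gaussian units, the full set of Maxwell’s equations in material media then reads 

\[ \nabla\cdot\mathbf{D} = 4\pi\rho_{\rm free} , \tag{A.187} \]
\[ \nabla\times\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} = \frac{4\pi}{c}\,\mathbf{j}_{\rm free} , \tag{A.188} \]
\[ \nabla\cdot\mathbf{B} = 0 , \tag{A.189} \]
\[ \nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0 , \tag{A.190} \]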


to be compared with eqs. (13.45)–(13.48). We observe that we can pass from Maxwell’s equations in the SI system to Maxwell’s equations in the Gaussian system by extending the formal replacements (A.7) as follows: 

\[ \epsilon_0 \to \frac{1}{4\pi} , \qquad \mu_0 \to \frac{4\pi}{c^2} , \qquad \mathbf{E} \to \mathbf{E} , \qquad \mathbf{B} \to \frac{\mathbf{B}}{c} , \tag{A.191} \]
\[ \mathbf{D} \to \frac{1}{4\pi}\,\mathbf{D} , \qquad \mathbf{H} \to \frac{c}{4\pi}\,\mathbf{H} , \tag{A.192} \]

while leaving $\rho_{\rm free}$ and $\mathbf{j}_{\rm free}$ unchanged. Furthermore, supplementing these replacements with 

\[ \mathbf{P} \to \mathbf{P} , \qquad \mathbf{M} \to c\,\mathbf{M} , \tag{A.193} \]

we see that, taking into account eqs. (A.191) and (A.192), eqs. (13.40) and (13.43) go into eqs. (A.185) and (A.186), respectively. We stress, once again, that these are just formal rules to get a quick translation of formulas from SI to Gaussian units. The actual relations between quantities in the two systems are given by eqs. (2.25), (2.32), and (2.33). Since the electric dipole is proportional to the electric charge, eq. (2.25) also implies that 


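\[ \mathbf{P}_{\rm gau} = \frac{1}{\sqrt{4\pi\epsilon_0}}\,\mathbf{P}_{\rm SI} , \tag{A.194} \]

and therefore,$^{7}$ 

\[ \mathbf{D}_{\rm gau} = \sqrt{\frac{4\pi}{\epsilon_0}}\,\mathbf{D}_{\rm SI} . \tag{A.195} \]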
0 


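Similarly, since the magnetic dipole is proportional to the current, and therefore to the charge, using again eq. (2.25), and furthermore including the extra factor of $1/c$ between eqs. (6.36) and (A.86), we get 

\[ \mathbf{M}_{\rm gau} = \frac{1}{c}\,\frac{1}{\sqrt{4\pi\epsilon_0}}\,\mathbf{M}_{\rm SI} = \sqrt{\frac{\mu_0}{4\pi}}\,\mathbf{M}_{\rm SI} , \tag{A.196} \]

and then$^{8,9}$ 

\[ \mathbf{H}_{\rm gau} = \sqrt{4\pi\mu_0}\,\mathbf{H}_{\rm SI} . \tag{A.197} \]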
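The relations (A.195) and (A.197) can be checked numerically. The following is a minimal sketch, assuming that the conversion factors of eqs. (2.32), (2.33), (A.194), and (A.196) are applied as pure numbers to field magnitudes expressed in SI base units (so the results are Gaussian quantities still expressed in SI base units, not in cgs units). The function name and the sample field values are hypothetical and chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018)
MU0 = 1.25663706212e-6   # vacuum permeability in H/m (CODATA 2018)


def fields_si_to_gaussian(E_si, P_si, B_si, M_si):
    """Rescale SI field magnitudes with the factors quoted in the text."""
    E_gau = math.sqrt(4.0 * math.pi * EPS0) * E_si   # eq. (2.32), as used in the footnote
    P_gau = P_si / math.sqrt(4.0 * math.pi * EPS0)   # eq. (A.194)
    B_gau = math.sqrt(4.0 * math.pi / MU0) * B_si    # eq. (2.33), as used in the footnote
    M_gau = math.sqrt(MU0 / (4.0 * math.pi)) * M_si  # eq. (A.196)
    return E_gau, P_gau, B_gau, M_gau


# Arbitrary sample magnitudes (hypothetical, for illustration only).
E_si, P_si, B_si, M_si = 3.0e4, 2.0e-7, 0.5, 1.2e3
E_gau, P_gau, B_gau, M_gau = fields_si_to_gaussian(E_si, P_si, B_si, M_si)

# SI definitions: D = eps0 E + P and H = B/mu0 - M.
D_si = EPS0 * E_si + P_si
H_si = B_si / MU0 - M_si

# Gaussian definitions, eqs. (A.185) and (A.186).
D_gau = E_gau + 4.0 * math.pi * P_gau
H_gau = B_gau - 4.0 * math.pi * M_gau

# Ratios to be compared with sqrt(4 pi / eps0) and sqrt(4 pi mu0).
d_ratio = D_gau / D_si
h_ratio = H_gau / H_si
print(d_ratio, math.sqrt(4.0 * math.pi / EPS0))
print(h_ratio, math.sqrt(4.0 * math.pi * MU0))
```

Each printed pair should agree up to rounding, which is the content of eqs. (A.195) and (A.197).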
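The equations governing the electrostatics of materials are 

\[ \nabla\cdot\mathbf{D} = 4\pi\rho_{\rm free} , \qquad \nabla\times\mathbf{D} = 4\pi\,\nabla\times\mathbf{P} , \tag{A.198} \]

compare with eq. (13.50), and therefore eqs. (13.51) and (13.52) become, respectively, 

\[ \mathbf{D}(\mathbf{x}) = -\nabla\left( \int d^3x'\,\frac{\rho_{\rm free}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) + \nabla\times\left( \int d^3x'\,\frac{\nabla'\times\mathbf{P}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) , \tag{A.199} \]

\[ \mathbf{E}(\mathbf{x}) = -\nabla\int d^3x'\,\frac{\rho_{\rm free}(\mathbf{x}') - \nabla'\cdot\mathbf{P}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} . \tag{A.200} \]

The equations governing the magnetostatics of materials are 

\[ \nabla\cdot\mathbf{H} = -4\pi\,\nabla\cdot\mathbf{M} , \qquad \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}_{\rm free} , \tag{A.201} \]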


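so the general solutions for H and B are 

\[ \mathbf{H}(\mathbf{x}) = \nabla\times\left( \frac{1}{c}\int d^3x'\,\frac{\mathbf{j}_{\rm free}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) + \nabla\left( \int d^3x'\,\frac{\nabla'\cdot\mathbf{M}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) , \tag{A.202} \]

\[ \mathbf{B}(\mathbf{x}) = \nabla\times\left( \frac{1}{c}\int d^3x'\,\frac{\mathbf{j}_{\rm free}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) + \nabla\left( \int d^3x'\,\frac{\nabla'\cdot\mathbf{M}(\mathbf{x}')}{|\mathbf{x}-\mathbf{x}'|} \right) + 4\pi\mathbf{M}(\mathbf{x}) , \tag{A.203} \]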
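compare with eqs. (13.55) and (13.56). The boundary conditions (13.63) and (13.70) become, respectively, 

\[ \hat{\mathbf{n}}\cdot(\mathbf{D}_2 - \mathbf{D}_1) = 4\pi\sigma_{\rm free} , \qquad \hat{\mathbf{n}}\times(\mathbf{H}_2 - \mathbf{H}_1) = \frac{4\pi}{c}\,\mathbf{K}_{\rm free} . \tag{A.204} \]

For a linear dielectric, the electric susceptibility $\chi_e$ is defined by 

\[ \mathbf{P}(t,\mathbf{x}) = \chi_e\,\mathbf{E}(t,\mathbf{x}) , \tag{A.205} \]

to be compared with eq. (13.71). Then, from eq. (A.185), 

\[ \mathbf{D}(t,\mathbf{x}) = \epsilon\,\mathbf{E}(t,\mathbf{x}) , \tag{A.206} \]

7 Explicitly, 

\[ \mathbf{D}_{\rm gau} = \mathbf{E}_{\rm gau} + 4\pi\mathbf{P}_{\rm gau} = \sqrt{4\pi\epsilon_0}\,\mathbf{E}_{\rm SI} + \frac{4\pi}{\sqrt{4\pi\epsilon_0}}\,\mathbf{P}_{\rm SI} = \sqrt{\frac{4\pi}{\epsilon_0}}\left( \epsilon_0\mathbf{E}_{\rm SI} + \mathbf{P}_{\rm SI} \right) = \sqrt{\frac{4\pi}{\epsilon_0}}\,\mathbf{D}_{\rm SI} . \]

8 Explicitly, 

\[ \mathbf{H}_{\rm gau} = \mathbf{B}_{\rm gau} - 4\pi\mathbf{M}_{\rm gau} = \sqrt{\frac{4\pi}{\mu_0}}\,\mathbf{B}_{\rm SI} - 4\pi\sqrt{\frac{\mu_0}{4\pi}}\,\mathbf{M}_{\rm SI} = \sqrt{4\pi\mu_0}\left( \frac{1}{\mu_0}\mathbf{B}_{\rm SI} - \mathbf{M}_{\rm SI} \right) = \sqrt{4\pi\mu_0}\,\mathbf{H}_{\rm SI} . \]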
9 From Maxwell’s equations, eqs. (A.187)–(A.190), and eqs. (A.185) and (A.186), we see that in the Gaussian system E, D, P, B, H, and M all have the same dimensions. In contrast, in the SI system we already saw after eq. (2.33) that, dimensionally, $[B_{\rm SI}] = [E_{\rm SI}]/[v]$, i.e., there is an extra factor of $1/c$ in $B_{\rm SI}$. Similarly, we see from eq. (13.40) that in the SI system D has the same dimensions as P, but these are different from the dimensions of E because there is an extra factor of $\epsilon_0$ and, from eq. (13.43), in the SI system H has the same dimensions as M, but the relation with the dimensions of B involves a factor $1/\mu_0$. 

10 As we saw in Note 49 on page 96, in the SI system conductivities are measured in siemens per meter (S/m); using eq. (4.188) and comparing with the dimensions of $\epsilon_0$, given in eqs. (2.12) and (2.13), we see that $\sigma_{\rm gau}$ has dimensions of inverse time, so it is measured in ${\rm s}^{-1}$. 


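11 Observe that, for the bound electrons, in the Gaussian system eq. (A.207) holds so, explicitly writing the label “gau,” eq. (A.217) reads 

\[ \epsilon_{\rm gau}(\omega) = 1 + 4\pi(\chi_e)_{\rm gau}(\omega) + i\,\frac{4\pi\sigma_{\rm gau}(\omega)}{\omega} , \tag{A.218} \]

while, from eqs. (13.90) and (13.73), 

\[ \epsilon_{\rm SI}(\omega)/\epsilon_0 = 1 + (\chi_e)_{\rm SI}(\omega) + i\,\frac{\sigma_{\rm SI}(\omega)}{\epsilon_0\,\omega} . \tag{A.219} \]

Using eqs. (A.211) and (A.214) we get back eq. (A.208), confirming the consistency of these relations. 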


where the dielectric constant $\epsilon$ is given by 

\[ \epsilon = 1 + 4\pi\chi_e , \tag{A.207} \]

to be compared with eq. (13.73). Note that, in the Gaussian system, there is no distinction between the permittivity and the dielectric constant (or relative permittivity). Writing eq. (13.72) as $\mathbf{D}_{\rm SI} = \epsilon_{\rm SI}\mathbf{E}_{\rm SI}$ and eq. (A.206) as $\mathbf{D}_{\rm gau} = \epsilon_{\rm gau}\mathbf{E}_{\rm gau}$, and using eqs. (2.32) and (A.195) we see that 

\[ \epsilon_{\rm SI}/\epsilon_0 = \epsilon_{\rm gau} , \tag{A.208} \]

so $\epsilon_{\rm gau}$ is the same as the relative permittivity, or dielectric constant, of the SI system. Then, comparing eqs. (13.73) and (A.207), written as 

\[ \epsilon_{\rm SI} = \epsilon_0\left[ 1 + (\chi_e)_{\rm SI} \right] , \tag{A.209} \]
\[ \epsilon_{\rm gau} = 1 + 4\pi(\chi_e)_{\rm gau} , \tag{A.210} \]

we see that $\chi_e$ is dimensionless both in the SI and Gaussian systems but the respective numerical values are different, 

\[ (\chi_e)_{\rm SI} = 4\pi\,(\chi_e)_{\rm gau} . \tag{A.211} \]
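As a quick illustration of eqs. (A.208)–(A.211), the sketch below converts an SI susceptibility to its Gaussian value and checks that the Gaussian dielectric constant coincides with the SI relative permittivity. The function names and the sample value of the susceptibility are hypothetical.

```python
import math


def chi_si_to_gaussian(chi_si):
    """Eq. (A.211): (chi_e)_SI = 4 pi (chi_e)_gau."""
    return chi_si / (4.0 * math.pi)


def epsilon_gaussian(chi_gau):
    """Eq. (A.210): eps_gau = 1 + 4 pi (chi_e)_gau."""
    return 1.0 + 4.0 * math.pi * chi_gau


def relative_permittivity_si(chi_si):
    """Eq. (A.209) divided by eps0: eps_SI / eps0 = 1 + (chi_e)_SI."""
    return 1.0 + chi_si


chi_si = 4.0  # hypothetical SI susceptibility of some linear dielectric
chi_gau = chi_si_to_gaussian(chi_si)

# Both numbers should coincide, as stated in eq. (A.208).
print(chi_gau, epsilon_gaussian(chi_gau), relative_permittivity_si(chi_si))
```

The susceptibility changes by a factor of $4\pi$ between the two systems, while the dielectric constant itself is unchanged.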
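In the Gaussian system, eq. (13.76) becomes 

\[ \rho_{\rm pol}(t,\mathbf{x}) = -\frac{\chi_e}{\epsilon}\,\nabla\cdot\mathbf{D}(t,\mathbf{x}) = -\frac{4\pi\chi_e}{1+4\pi\chi_e}\,\rho_{\rm free}(t,\mathbf{x}) . \tag{A.212} \]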


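We now turn to conductive matter. In the SI system the conductivity is defined by Ohm’s law (4.183), $\mathbf{j}_{\rm SI} = \sigma_{\rm SI}\mathbf{E}_{\rm SI}$. Similarly, in the Gaussian system it is defined by $\mathbf{j}_{\rm gau} = \sigma_{\rm gau}\mathbf{E}_{\rm gau}$. Combining eq. (2.32) with 

\[ \mathbf{j}_{\rm gau} = \frac{1}{\sqrt{4\pi\epsilon_0}}\,\mathbf{j}_{\rm SI} , \tag{A.213} \]

which follows from eq. (2.25), we obtain$^{10}$ 

\[ \sigma_{\rm gau} = \frac{\sigma_{\rm SI}}{4\pi\epsilon_0} . \tag{A.214} \]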
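To get a feeling for the numbers involved in eq. (A.214), the following minimal sketch converts a conductivity from S/m to the Gaussian unit of inverse seconds. The function name is hypothetical, and the input value is a commonly quoted room-temperature conductivity of copper, used here only as an illustrative assumption.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018)


def conductivity_si_to_gaussian(sigma_si):
    """Eq. (A.214): sigma_gau = sigma_SI / (4 pi eps0).

    With sigma_SI in S/m and eps0 in F/m the ratio is in 1/s,
    since (S/m)/(F/m) = S/F = 1/s.
    """
    return sigma_si / (4.0 * math.pi * EPS0)


sigma_copper_si = 5.96e7  # S/m, assumed illustrative value for copper
sigma_copper_gau = conductivity_si_to_gaussian(sigma_copper_si)
print(f"{sigma_copper_gau:.3e} s^-1")
```

The result is of order $10^{17}\,{\rm s}^{-1}$, which shows how different the numerical values of the conductivity are in the two systems.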

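For metals, eqs. (13.88) and (13.89) become 

\[ \epsilon(\omega)\,\nabla\cdot\mathbf{E}(\omega,\mathbf{x}) = 0 , \tag{A.215} \]
\[ \nabla\times\mathbf{B}(\omega,\mathbf{x}) + i\,\frac{\omega}{c}\,\epsilon(\omega)\,\mathbf{E}(\omega,\mathbf{x}) = 0 , \tag{A.216} \]

where 

\[ \epsilon(\omega) = \epsilon_b(\omega) + i\,\frac{4\pi\sigma(\omega)}{\omega} , \tag{A.217} \]

to be compared with eq. (13.90). 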
We next consider magnetic matter. In Gaussian units the magnetic susceptibility $\chi_m$ and the magnetic permeability $\mu$ are still defined from 

\[ \mathbf{M} = \chi_m\mathbf{H} , \qquad \mathbf{B} = \mu\,\mathbf{H} , \tag{A.220} \]

as in eqs. (13.91) and (13.92). Then, from eq. (A.186), in Gaussian units 

\[ \mu = 1 + 4\pi\chi_m , \tag{A.221} \]

to be compared with eq. (13.93). Proceeding as in the derivation of eq. (A.211), we obtain 

\[ (\chi_m)_{\rm SI} = 4\pi\,(\chi_m)_{\rm gau} . \tag{A.222} \]
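Energy conservation in materials reads 

\[ \frac{1}{4\pi}\int_V d^3x \left( \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t} + \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t} \right) + \int_V d^3x\,\mathbf{E}\cdot\mathbf{j}_{\rm free} = -\frac{c}{4\pi}\int_{\partial V} d\mathbf{s}\cdot(\mathbf{E}\times\mathbf{H}) , \tag{A.223} \]

to be compared with eq. (13.95). The term on the right-hand side gives the generalization of the Poynting vector to material media, 

\[ \mathbf{S} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} . \tag{A.224} \]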


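For a simple linear medium with $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$, the energy density of the electromagnetic field is 

\[ u = \frac{1}{8\pi}\left( \epsilon E^2 + \mu H^2 \right) , \tag{A.225} \]

to be compared with eq. (13.98). The Larmor frequency (13.126) becomes 

\[ \omega_L = \frac{eB}{2m_e c} , \tag{A.226} \]

and eq. (13.132) becomes $\chi_m = -n e^2\langle r_\perp^2\rangle/(4m_e c^2)$. 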


Chapter 14 


In Gaussian units, the relation between the dielectric constant and the electric susceptibility is given by eq. (A.207), so its frequency-dependent generalization, which in SI units is given by eq. (14.10), becomes 

\[ \epsilon(\omega) = 1 + 4\pi\chi_e(\omega) , \tag{A.227} \]

while eq. (14.12) becomes 

\[ \mathbf{P}(\omega) = \chi_e(\omega)\,\mathbf{E}(\omega) . \tag{A.228} \]

Equation (14.28) becomes 

\[ \mathbf{D}(t) = \mathbf{E}(t) + 4\pi\int dt'\,\chi_e(t-t')\,\mathbf{E}(t') . \tag{A.229} \]


The plasma frequency (14.38) is now given by 

\[ \omega_p^2 = \frac{4\pi n_b e^2}{m_e} , \tag{A.230} \]
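Since the plasma frequency is a physical quantity, its value must not depend on the system of units. The sketch below checks this numerically by evaluating the Gaussian expression $\omega_p^2 = 4\pi n e^2/m_e$ in cgs units and the standard SI expression $\omega_p^2 = n e^2/(\epsilon_0 m_e)$ in SI units. The function names are hypothetical, and the electron density is an assumed illustrative value of the order typical of metals.

```python
import math

# SI values (CODATA 2018).
E_SI = 1.602176634e-19     # electron charge in C
ME_SI = 9.1093837015e-31   # electron mass in kg
EPS0 = 8.8541878128e-12    # vacuum permittivity in F/m

# Gaussian (cgs) values of the same quantities.
E_GAU = 4.803204713e-10    # electron charge in statC
ME_GAU = 9.1093837015e-28  # electron mass in g


def plasma_frequency_gaussian(n_per_cm3):
    """Gaussian form: omega_p^2 = 4 pi n e^2 / m_e, with n in cm^-3."""
    return math.sqrt(4.0 * math.pi * n_per_cm3 * E_GAU**2 / ME_GAU)


def plasma_frequency_si(n_per_m3):
    """Standard SI form: omega_p^2 = n e^2 / (eps0 m_e), with n in m^-3."""
    return math.sqrt(n_per_m3 * E_SI**2 / (EPS0 * ME_SI))


n_per_m3 = 8.5e28              # assumed illustrative electron density
n_per_cm3 = n_per_m3 * 1.0e-6  # the same density in cm^-3

omega_gau = plasma_frequency_gaussian(n_per_cm3)
omega_si = plasma_frequency_si(n_per_m3)
print(f"{omega_gau:.4e} rad/s (Gaussian), {omega_si:.4e} rad/s (SI)")
```

The two results agree up to the rounding of the constants, as they must.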


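and, in the Drude-Lorentz model, we have 

\[ \chi_e(\omega) = \frac{n_b e^2}{m_e}\,\frac{1}{\omega_0^2 - \omega^2 - i\omega\gamma_0} , \tag{A.231} \]

and therefore 

\[ \epsilon(\omega) = 1 + \frac{\omega_p^2}{\omega_0^2 - \omega^2 - i\omega\gamma_0} . \tag{A.232} \]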
The conductivity in the Drude model is given by 

\[ \sigma(\omega) = \frac{n_f e^2\tau}{m_e}\,\frac{1}{1 - i\omega\tau} , \tag{A.233} \]

12 Recall that, in Gaussian units, $\epsilon(\omega)$ and $\mu(\omega)$ are dimensionless, see e.g., eqs. (A.210) and (A.221), so $n(\omega)$ remains dimensionless also in Gaussian units. 


which is formally the same as eq. (14.58). Note that this is consistent with 
eq. (A.214), given the relation between the charges in SI and Gaussian units, 
eq. (A.2). 

The dielectric function of metals, which in SI units is given in eq. (14.76), 
is still formally given by 


N 
=14 : A.234 
Wp 22 w? — L iwyi ( ) 

where, however, now we = Anne? /™Me. 


Chapter 15 


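Searching for monochromatic wave solutions of the form (15.1), Maxwell’s equations in material media, eqs. (A.187)–(A.190), give 

\[ \epsilon(\omega)\,\mathbf{k}\cdot\mathbf{E}(\omega,\mathbf{k}) = 0 , \tag{A.235} \]
\[ \mathbf{k}\times\mathbf{B}(\omega,\mathbf{k}) = -\frac{\omega}{c}\,n^2(\omega)\,\mathbf{E}(\omega,\mathbf{k}) , \tag{A.236} \]
\[ \mathbf{k}\cdot\mathbf{B}(\omega,\mathbf{k}) = 0 , \tag{A.237} \]
\[ \mathbf{k}\times\mathbf{E}(\omega,\mathbf{k}) = \frac{\omega}{c}\,\mathbf{B}(\omega,\mathbf{k}) , \tag{A.238} \]

where the refraction index $n(\omega)$ is now defined from$^{12}$ 

\[ n(\omega) = \sqrt{\epsilon(\omega)\mu(\omega)} . \tag{A.239} \]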


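The dispersion relation (15.11) remains unchanged, and therefore also eqs. (15.18) and (15.19), while eq. (15.16) is replaced by 

\[ \mathbf{B}_{\mathbf{k}} = n(\omega)\,\hat{\mathbf{k}}\times\mathbf{E}_{\mathbf{k}} . \tag{A.240} \]

The dispersion relation (15.39) becomes 

\[ \frac{\omega^2}{c^2}\,\epsilon(\omega)\mu(\omega) = k^2 . \tag{A.241} \]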
The expression (15.43) for the skin depth becomes 


Cc 
dskin(w) = ———. ; A.242 


The equations of Section 15.4.1 for the propagation in waveguides are transformed into Gaussian units with the usual replacement $\mathbf{B} \to \mathbf{B}/c$. 


Chapter 16 
Most results of this chapter are independent of whether one uses SI or Gaussian units, except eqs. (16.18) and (16.19) that in Gaussian units read, respectively, 

\[ \frac{dP(t;\theta)}{d\Omega} = \frac{1}{4\pi}\,\frac{e^4E^2}{m_e^2c^3}\,\sin^2\theta , \tag{A.243} \]

and $I = cE^2/4\pi$. Therefore eq. (16.20) still holds, with the classical electron radius defined as 

\[ r_0 = \frac{e^2}{m_e c^2} . \tag{A.244} \]
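As a numerical check of eq. (A.244), the sketch below evaluates the classical electron radius from the Gaussian expression in cgs units and compares it with the standard SI expression $r_0 = e^2/(4\pi\epsilon_0 m_e c^2)$. The variable names are hypothetical; the constants are CODATA 2018 values.

```python
import math

# Gaussian (cgs) values.
E_GAU = 4.803204713e-10    # electron charge in statC
ME_GAU = 9.1093837015e-28  # electron mass in g
C_GAU = 2.99792458e10      # speed of light in cm/s

# SI values.
E_SI = 1.602176634e-19     # electron charge in C
ME_SI = 9.1093837015e-31   # electron mass in kg
C_SI = 2.99792458e8        # speed of light in m/s
EPS0 = 8.8541878128e-12    # vacuum permittivity in F/m

# Eq. (A.244): r_0 = e^2 / (m_e c^2), result in cm.
r0_cm = E_GAU**2 / (ME_GAU * C_GAU**2)

# Standard SI expression, result in m.
r0_m = E_SI**2 / (4.0 * math.pi * EPS0 * ME_SI * C_SI**2)

print(f"r0 = {r0_cm:.4e} cm (Gaussian), {r0_m:.4e} m (SI)")
```

The two values describe the same length, about $2.8\times 10^{-13}$ cm.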
Similarly, eqs. (16.41) and (16.42) become 
dP(6) 1 (e?Eo\” wt 6 
= 0 A.245 
dQ 8x3 ( Me (w? — we)? + wy eter we ( ) 


and (I(t)) = cE? /4r. so eq. (16.44) still holds, again with the classical electron 
radius given by eq. (A.244). 
Galilean Relativity, 156 
Gauge transformation, 53, 185 
Gauss’s law, 40 
integrated form, 42 
Gauss’s theorem, 7 
Gaussian units, 31-37, 411-430 
rationalized, 34, 411 
Gradient 
in cylindrical coordinates, 4 
in polar coordinates, 4 
Green’s function 
advanced, 238 
d’Alembertian, 235-241, 336-338 
Helmholtz operator, 238 
Laplacian, 60 
method, 59 
retarded, 238 
Green’s identities, 64 
Green’s reciprocity relation, 130 
Group velocity, 385 
Group, definition, 19 
Gyromagnetic ratio, 153 


H field, 352 

in Gaussian units, 426 
Helmholtz operator, 237 
henry (H), SI unit of inductance, 98, 113 
Line integral, 5 
Lorentz boost, 160 
Lorentz contraction, 165 
Lorentz force, 30, 40 
covariant form, 193 
in Gaussian units, 33 
relativistic, 40, 47, 193 
Lorentz group, 166-177 
Lorentz transformations, 159-162 
infinitesimal, 172 
of electromagnetic fields, 190-191 
in Gaussian units, 418 
Magnetic dipole radiation, 285-289 
Magnetic field 
Magnetization, 350 
Magnetization current, 350 
energy, 109-112 
Maxwell’s equations, 30, 40, 41 
covariant form, 186-188 
in material media, 351-352 
in Gaussian units, 426 
Maxwell’s stress tensor, 49 
Metals, 358-359 
radiative, 283 
Neumann boundary conditions, 65 
Noether theorem, 205-207 
and energy-momentum tensor, 208-209 
Ohm’s law, 95, 358 
Paramagnetism, 360 
Parity transformation, 55 
Parseval theorem, 276 
of vacuum, 28-29 


Photon mass, limits, 15 
Plancherel theorem, 276 
Plasma frequency, 371, 378, 379 
Point particles 
charge density, 44 
current density, 44 
interaction with electromagnetic field, 
194 
relativistic action, 180 
Poisson’s equation, 55, 57, 414 
Polarization of light 
Polarization vector of EM waves 
longitudinal, 221 
transverse, 220 
Polarization vector of materials (P), 
348-349 
relation between SI and Gaussian, 427 
Post-Newtonian expansion, 295-312 
1.5PN order, 323-328 
1PN Hamiltonian, 308-310 
1PN order, 298-310 
Poynting vector, 45 
in materials, 360, 429 
Poynting’s theorem, 45 
Proper time, 163 


Quadrupole moment, 136 
energy in external electric field, 146 
reduced, 136 


time-dependent, 280 
Quadrupole radiation, 285-289 


Radiated power 
electric dipole, 285 
for arbitrary source velocities, 275-277 
Radiation gauge, 223 
Radiation reaction, 312-343 
Rapidity, 160, 182 
Refraction index, 382, 430 
Regularization of divergences, 105 
Relative permittivity, 356 
Representations of groups, 19 
equivalent, 20 
reducible, 21-24 
Resistance, 96 
Resistivity, 96 
Resonance, 408 
Resonant scattering, 408 
Retarded time, 242 


Schott term, 331 
Self-force, 312-343 
SI units, 27-31 
siemens, 96 
Skin depth, 388 
Skin effect, 388-391 
anomalous, 390 
Space-like interval, 157 
Spontaneous symmetry breaking, 86 
statcoulomb, 32 
STF tensors, 136 
Stokes’s theorem, 5 
Superposition principle, 89, 100 
Synchrotron, 264 
Synchrotron radiation, 264 


Tensors, 16-18 
decomposition in irreducible 
representations, 23-24 
Theorem for curl-free fields, 7 
Theorem for divergence-free fields, 7 
Thomas—Reiche—Kuhn sum rule, 371 
Thomson cross-section, 404 
Time dilatation, 163 
Time reversal, 56 
Time-like interval, 157 
Torque 
electric dipole in external field, 146 
electric quadrupole in external field, 
146 
magnetic dipole in external field, 151 
Transverse-longitudinal decomposition of 
vector fields, 292 


Vector potential, 52 
volt, 30 


Wave equation, 217-219 

Wavelength, 225 
UV/visible/IR/microwaves, 388 

Wavenumber, 225 

Work done by Lorentz force, 47, 193 

World-line of a particle, 178 


Yukawa potential, 15 


